FORM PTO-1
(REV. 11-98) 



DEPARTMENT OF COMMERCE 
PATENT AND TRADEMARK OFFICE



TRANSMITTAL LETTER TO THE UNITED STATES 
DESIGNATED/ELECTED OFFICE (DO/EO/US)
CONCERNING A FILING UNDER 35 U.S.C. 371 



ATTORNEY'S DOCKET NUMBER 
20177YP 



U.S. APPLICATION NO. (If known, see 37 CFR 1.5)
09/622,964



INTERNATIONAL APPLICATION NO. 
PCT/US99/03790



INTERNATIONAL FILING DATE 
22 February 1999 



PRIORITY DATE CLAIMED 
25 February 1998



TITLE OF INVENTION 

BEST'S MACULAR DYSTROPHY GENE



APPLICANT(S) FOR DO/EO/US

Konstantin Petrukhin, C. Thomas Caskey and Michael Metzker



Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and 
other information: 

1. [X] This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. [ ] This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371.

3. This is an express request to begin national examination procedures [35 U.S.C. 371(f)] at any time rather than
delay examination until the expiration of the applicable time limit set in 35 U.S.C. 371(b) and PCT Articles 22
and 39(1).

4. A proper Demand for International Preliminary Examination was made by the 19th month from the earliest
claimed priority date.

5. [X] A copy of the International Application as filed [35 U.S.C. 371(c)(2)].

a. [X] is transmitted herewith (required only if not transmitted by the International Bureau).

b. [ ] has been transmitted by the International Bureau.

c. [ ] is not required, as the application was filed in the United States Receiving Office (RO/US).

6. [ ] A translation of the International Application into English [35 U.S.C. 371(c)(2)].

7. [ ] Amendments to the claims of the International Application under PCT Article 19 [35 U.S.C. 371(c)(3)].

a. [ ] are transmitted herewith (required only if not transmitted by the International Bureau).

b. [ ] have been transmitted by the International Bureau.

c. have not been made; however, the time limit for making such amendments has NOT expired.

d. [ ] have not been made and will not be made.

8. [ ] A translation of the amendments to the claims under PCT Article 19 [35 U.S.C. 371(c)(3)].

9. [X] An oath or declaration of the inventor(s) [35 U.S.C. 371(c)(4)].

10. [ ] A translation of the annexes to the International Preliminary Examination Report under PCT Article 36
[35 U.S.C. 371(c)(5)].

Items 11 to 16 below concern other document(s) or information included:

11. [ ] An Information Disclosure Statement under 37 CFR 1.97 and 1.98.

12. [ ] An assignment document for recording. A separate cover sheet in compliance with 37 CFR 3.28 and 3.31
is included.

13. [ ] A FIRST preliminary amendment.

[ ] A SECOND or SUBSEQUENT preliminary amendment.

14. [ ] A substitute specification.

15. A change of power of attorney and/or address letter.

16. [ ] Other items or information:
Computer generated form "TransLetter (DO-EO-US)" (PCT Folder), Merck & Co., Inc., 12/29/99



U.S. APPLICATION NO. (If known, see 37 CFR 1.5)
09/622,964


INTERNATIONAL APPLICATION NO. 
PCT/US99/03790 


ATTORNEY'S DOCKET NUMBER 
20177YP 


17. [X] The following fees are submitted:

                                                                            CALCULATIONS   PTO USE ONLY

BASIC NATIONAL FEE [37 CFR 1.492(a)(1)-(5)]:

  Neither international preliminary examination fee (37 CFR 1.482)
  nor international search fee [37 CFR 1.445(a)(2)] paid to USPTO
  and International Search Report not prepared by the EPO or JPO .... $970.00

  International preliminary examination fee (37 CFR 1.482) not paid to
  USPTO but International Search Report prepared by the EPO or JPO .... $840.00

  International preliminary examination fee (37 CFR 1.482) not paid to USPTO
  but international search fee [37 CFR 1.445(a)(2)] paid to USPTO .... $690.00

  International preliminary examination fee paid to USPTO (37 CFR 1.482)
  but all claims did not satisfy provisions of PCT Article 33(1)-(4) .... $670.00

  International preliminary examination fee paid to USPTO (37 CFR 1.482)
  and all claims satisfied provisions of PCT Article 33(1)-(4) .... $96.00

ENTER APPROPRIATE BASIC FEE AMOUNT =                                        $96.00

Surcharge of $130.00 for furnishing the oath or declaration later than
[ ] 20  [ ] 30 months from the earliest claimed priority date [37 CFR 1.492(e)].    $0.00

Claims                  Number Filed    Number Extra    Rate
Total Claims            18 - 20 =       0               x $18.00            $0.00
Independent Claims       8 -  3 =       5               x $78.00            $390.00
Multiple dependent claim(s) (if applicable)   0         + $260.00           $0.00

TOTAL OF ABOVE CALCULATIONS =                                               $486.00

Reduction by 1/2 for filing by small entity, if applicable. Verified Small Entity
Statement must also be filed. (Note 37 CFR 1.9, 1.27, 1.28).

SUBTOTAL =                                                                  $486.00

Processing fee of $130.00 for furnishing the English translation later than
[ ] 20  [ ] 30 months from the earliest claimed priority date [37 CFR 1.492(f)].    $0.00

TOTAL NATIONAL FEE =                                                        $486.00

Fee for recording the enclosed assignment [37 CFR 1.21(h)]. The assignment must be
accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property.

TOTAL FEES ENCLOSED =                                                       $486.00

Amount to be refunded:
Amount to be charged:


a. [ ] A check in the amount of $          to cover the above fees is enclosed.

b. [X] Please charge my Deposit Account No. 13-2755 in the amount of $486.00 to cover the above fees.
A duplicate copy of this sheet is enclosed.

c. [X] The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any
overpayment to the Deposit Account No. 13-2755. A duplicate copy of this sheet is enclosed.
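
The fee arithmetic on this sheet can be reproduced with a short sketch. The rates ($18.00 per extra claim over 20, $78.00 per extra independent claim over 3) and the $96.00 basic fee are taken from the form itself; the function name `national_fee` and its parameters are hypothetical, and the surcharge, multiple-dependent-claim, processing and assignment-recording lines are left out because they are $0.00 or blank here.

```python
from decimal import Decimal


def national_fee(basic_fee, total_claims, independent_claims,
                 claim_rate=Decimal("18.00"), indep_rate=Decimal("78.00"),
                 free_total=20, free_indep=3, small_entity=False):
    """Illustrative only: basic fee plus excess-claim fees as laid out on the sheet."""
    # Claims beyond the free allowance are charged; a negative difference counts as zero.
    extra_total = max(total_claims - free_total, 0)
    extra_indep = max(independent_claims - free_indep, 0)
    subtotal = basic_fee + extra_total * claim_rate + extra_indep * indep_rate
    # The form halves the amount for a verified small entity (not claimed here).
    if small_entity:
        subtotal = subtotal / 2
    return subtotal


# Values entered on this sheet: 18 total claims, 8 independent claims.
fee = national_fee(Decimal("96.00"), 18, 8)
print(fee)  # 486.00
```

With 18 total claims there are no extra claims, and 8 independent claims give 5 extra at $78.00 each, so the result agrees with the $486.00 charged to the deposit account.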



NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive
[37 CFR 1.137(a) or (b)] must be filed and granted to restore the application to pending status.

SEND ALL CORRESPONDENCE TO: 

MERCK & CO., INC. 
Patent Department, RY60-30 
P.O. Box 2000 
126 East Lincoln Avenue 
Rahway, New Jersey 07065-0970 

DATE: August 23, 2000

PHONE #: (732) 594-6734

Joseph A. Coppola
NAME

38,413
REGISTRATION NUMBER
DT17 Rec'd PCT/PTO 31 JUL 2002

PATENT

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants: Petrukhin, et al.

Serial No.: 09/622,964          Case No.: 20177YP
Filed: August 23, 2000 

For: BEST'S MACULAR DYSTROPHY GENE 



Art Unit: 



Examiner: 



Assistant Commissioner for Patents 
Washington, D.C. 20231



SUBMISSION OF SEQUENCE STATEMENT 



Sir: 



Please insert the enclosed Sequence Listing in paper and computer readable 
form into the above-identified application. The Sequence Listing does not contain new 
matter. The contents of the paper and computer readable forms of the Sequence Listing are 
the same. 



[illegible] ADDRESSEE BEFORE 5 P.M. ON THE ABOVE DATE IN
AN ENVELOPE ADDRESSED TO ASSISTANT COMMISSIONER
FOR PATENTS, WASHINGTON, D.C. 20231.

DATE [illegible]



Date: 






Respectfully submitted, 




me M. Giesser 
No. 32,838 
Attorney for Applicant 

MERCK & CO., INC. 
P.O. Box 2000 

Rahway, New Jersey 07065-0907 
(908) 594-3046 
SEQUENCE LISTING



<110> Petrukhin, Konstantin
Caskey, C Thomas 
Metzker, Michael 
Claes, Wadelius 

<120> BEST'S MACULAR DYSTROPHY GENE



<130> 20177YP 

<140> 09/622,964 
<141> 2000-08-23 

<150> PCT/US99/03790 
<151> 1999-02-22 

<150> 60/122,926 
<151> 1998-12-18 

<150> 60/075,941 
<151> 1998-02-25 

<160> 31 

<170> FastSEQ for Windows Version 4.0

<210> 1 

<211> 16125 

<212> DNA 

<213> Homo sapiens



<220> 

<221> misc_feature
<222> (1)...(16125)
<223> n = A,T,C or G
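
The numeric identifiers above (`<110>` applicant, `<130>` file reference, `<150>`/`<151>` prior application and date, `<160>` number of sequences, and so on) follow a simple tag-plus-value layout. As a minimal sketch, assuming each identifier starts a line and any untagged line continues the previous identifier, they can be collected into a dictionary; the helper name `parse_identifiers` is hypothetical and not part of any filing tool.

```python
import re
from collections import defaultdict

# A three-digit numeric identifier in angle brackets, followed by its value.
FIELD = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_identifiers(text):
    """Collect header values per identifier; repeated tags keep every value in order."""
    fields = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match:
            current = match.group(1)
            fields[current].append(match.group(2))
        elif current is not None:
            # Untagged line: treat it as another value of the previous identifier,
            # e.g. additional applicant names listed under <110>.
            fields[current].append(line)
    return dict(fields)


sample = """<110> Petrukhin, Konstantin
Caskey, C Thomas
<130> 20177YP
<150> PCT/US99/03790
<151> 1999-02-22
<150> 60/075,941
<151> 1998-02-25
<160> 31
"""

fields = parse_identifiers(sample)
print(fields["150"])  # ['PCT/US99/03790', '60/075,941']
```

Keeping a list per identifier preserves the pairing by position between each `<150>` prior application and its `<151>` filing date.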



<400> 1 „ „ttctcttgg gggttggggc ^caagcggg ^gagggc ^tttgggca 
ccaaaaaatt gttctctuyy =„^ rtttaa caccttaggt tggtuay tatataccta 
aattggctta "gccacgca agggcttt^ gtaa tgcaccatca tgtat^ 

ggcaacccac catggcacac g t atC ccaggac "tagagtg tcagagag 

tgtaaccaac ctggtacatt ctgca y cat agttaac scttagtaca t aa 
ggtgtgtaga aaaatcacct _ g « gaga |? gggtttaaga cacaaggtca tatta 
^aaaftS Scccaaaac cacacatc c ataatcccct g 



ggtgtgtaga aaaatcacct gga * gagag ? ggg tttaaga cacaagg « gca £ gc ttg 
Igagggtgac aggaaaggga g |^g 9 cacacatc tc ataatcccct g gaaaact 
tcagggcttc tggaagttta | ggccacaga ctcaga ctct || g ^agat accactgctg 
attaaaatgc aacatcccta yy t acg g C ctcta aagacca y gaca cccacc 

gcccgtttaa taaacatttg |^cg acaaga ggag ts^aggta g tc 

cctagagctc tgctctcttc _«9 at ggattgat ^attaaaat ac 

acttccaaca ^ttaggaga gcc g gggatc tggg "tttattcc ccaccagact 
acatgctgag attt ^^ a q Cat aaactc cagggctgtt ctgtcaa tactga 
tggctggctg S^ccagca |« tccttccatc ^ctctgaagc caga 

cacccccctc caccagcccc gg «» aaC gtatgat S^caccagc « a 

tgggccctgc cagccaatca gaattc £ gtc at tttactag ^gatgaaa cgggg 
gctcctcgtc agcatatgca _ gctgagagag sagctgaaa t 

Icaccatcct tttcagataa 3J9 agaaaccagg ^gttgact g & g 

tcaccacaca caggtggcaa gg _ g t t caaagacccc ag ?^"ccc taacccccaa 

'"SSS SSl==«t McctctM. 



iiil SsS s5» — " S! " ot " "°~ 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
ccctccaaga

gattacaaga 

gtgacggcgg 

ggccacagag 

gacccacctg 

tgaggccttc 

agcacatgcg 

agcttggtgt 

aagaatggag 

gaaggtctct 

gaggagccgc 

ttccaggggg 

tcgagagaaa 

ctctctcaaa 

ggtagnnnnn 

ggagcttcca 

agtagtgata 

gcagtggggc 

agtgggaagg 

ggtggaaagg 

gttccattat 

ctcacactag 

gcctagccca 

ttgtattatt 

gcttgagtgc 

ttctcctgcc 

atttttttgt 

tcctgacctt 

gccactgcgc 

cgtcttccct 

aaaaggcagt 

atgggagagt 

atccctacaa 

cccactgcct 

ccttctcccg 

tcctaatctt 

gctgggccgg 

cctcccagcc 

accagctcct 

cacactgaga 

aggaggaccc 

tcccacacag 

gagagagatg 

tccccctgcc 

aaccacatgg 

caggctggag 

gcgattctcc 

ggctaatttt 

gctggtcttg 

attacaggtg 

attaaaatgt 

ggggccaagg 

cccgtctcta 

agctacttgg 

agctatgatc 

ataatgttta 

aaaagcatta 

aataatgcta 

agttagggcc 

ggaggatcac 

tctctataaa 

tttgggaggc 



agaaattaga 

agggactaag 

ctcagcactc 

tcccagggag 

gaaccccacc 

agggttggat 

caggtgtgtg 

gcatcaggag 

ggggcatcaa 

tctttcgata 

ccctcaaaaa 

agctcgaacc 

aggagcggcc 

aagatgaaga 

nnnnntgctt 

ttctgaggag 

agtgctctct 

tctatttcca 

aaccatgcag 

ccctggagcc 

gggaatggga 

ggtggttgag 

gaggacctga 

tatttattta 

aatggcgtga 

tcagcctcct 

atttttagta 

aggtgatcca 

ccagtgatta 

gccaaagcaa 

cagaactggc 

tgaggtccag 

acccccaatc 

ggccatgacc 

cctgctgctg 

cctgctctgc 

ggggcctggg 

agctcagggc 

gggcactgga 

ggctgctcaa 

ctggagccca 

actcataggc 

gagtctcact 

ttagcctccc 

tacttttttt 

tgcagtgggg 

tgccttagcc 

tttttttttc 

aacccctgac 

tcagccacca 

ttatctaagg 

tgcggggatc 

ccaaaaattt 

gaagctgagg 

acaccactgc 

tctaaacggt 

atgattacat 

gcaacaaggc 

agccacaggg 

ttgagcccag 

aaattaaaaa 

cgaggcgggt 



ggggccatgg 

acaaggactc 

acgtgggcag 

tcccaccagc 

tgtgagtaca 

ggccatcttg 

ggcacctgtg 

cgacagccag 

tcactgacaa 

gaagaaggga 

ataagggagg 

ctttagaggg 

gcccaaaaaa 

ggaagccgga 

cagtaaattt 

gaaacaggca 

agaaatatca 

ggttggatgg 

gtatctcagg 

accattcagt 

atatggtggt 

agagcttggg 

gcttagtgtg 

tttattgatc 

tctcagctca 

gagtagctgg 

gagacagggt 

cctgcctcga 

tagaaagtta 

agggcagcct 

agggccttgg 

agcagggaag 

ggtgtccctc 

atcacttaca 

tgctggcggg 

tactacatca 

aaggatgtgg 

ccagtgcacc 

gctgaggctg 

gccaggccag 

ggctttgtct 

ccacatagta 

gtgttgtcca 

aaggggctgg 

tttttttttt 

gcaatcttgg 

tcctgagtag 

tgtattttta 

ctcaagtgat 

tgcacagccc 

ccagtagcag 

acttgagcct 

aaaaaattag 

tgtggggatg 

acttcagcct 

aaggtataat 

ggattgtaaa 

acatttggtt 

gctcacacct 

gagtttagga 

attggctagg 

ggatcatgag 



ccaggctgtg 

ctttgtggag 

tgccagcctc 

ctagtcgcca 

aggtgcccca 

cgtatttgtg 

tgtctgtgca 

ccagtgtggc 

aattatttat 

gagagggggt 

gaggacccaa 

agcgtgggag 

tatccctccc 

gttgtatgtg 

ttattgagcg 

ggaaacaggc 

agcaaggtga 

ttgggaacat 

aagagcttcc 

aaacatcatt 

ggacagggct 

agctaacgaa 

tagacattgc 

ttaagacaga 

ctgcaacctc 

gattacaggc 

ttcaccatgt 

cttcccaaag 

aaggcacatg 

ctgggctcac 

agaccacttc 

ggtcctgaca 

tctaccagga 

caagccaagt 

gcagcatcta 

tccgctttat 

ctggggctgg 

agtccactac

cgcgctgggg 

cagggtttta 

ggccccactc 

cattaaaaaa 

ggctggtctc 

gattacaggt 

ttttttgaga 

ctcactgtaa 

ctggaattat 

gtagagacag 

ccacccacct 

acatggtaca 

tgactcgcgt 

gggagttcag 

ctgggagtgg 

gctgaagcct 

gagtgacagg 

cacagaatat 

atatcaaata 

tttactaggg 

gtaatcccag 

cctgagcaac 

ccctttggct 

gtcaggagtt 



ctagccgttg 

gtcctggctt 

taagagtggg 

gaccttctgt 

ggtggactgg 

tgggatatgc 

aatgccctga 

tgcagcaaaa 

agagctcccc 

ttgtccttat 

gaccccgtgg 

aaccgctgta 

gggcgataag 

ttgatatttt 

ccttctacga 

agatatcctg 

ggagacacag 

cctttctaaa

tccaggcagg 

tgagcatctc 

gcctggtccc 

caagatgggc 

tgctgttact 

gttttgctct 

cacctcctgg 

acccgcacca 

tggccaggct 

tgctgggatt 

gcaatgcaca 

tttcttgcgt 

atccacctcc 

ggctctgacc 

cccaagccca 

ggctaatgcc 

caagctgcta 

ttataggtaa 

gagctgggag 

aacactaagc 

gctgggcaga 

gccacccttc 

tactggcctg 

gagagagaga 

gaactcctag 

gtgagctact 

cagggtttca 

cctctgcctc 

aggcacacac 

ggtttcatca 

cggcctccca 

ttttttaaaa 

ctgtaatccc 

cgtgggcaac 

tggcatttgc 

gtgaggtcga 

ctatctcaaa 

atgatagcat 

catgaaattc 

caccaaggta 

cactttggga 

atagggagat 

tacacccgta 

caagaccagc 



cttctgagca 

agggagtcaa 

caggggcact 

gggatcatcg 

gctggggctt 

acacacaggc 

ggtgggaatg 

cacacaggga 

ctaaaaaaaa 

aaatataagg 

gttgtgtgtt 

ttcaggcctc 

aaatggtggc 

taaaactcca 

gaacacaaga 

tataatttca 

agcaccggtg 

gggaacctgg 

aagatcagca 

taccagctag 

ttccatactt 

tgagaacact 

gcctttgtcg 

tcttacccag 

gatcaagcga 

cgcctggata 

ggtctcgaac 

ataggcatga 

cgcctatcta 

ttctacttcc 

tagggtccct 

agggcctctg 

cctgctgcag 

cgcttaggct 

tatggcgagt 

agctggcagg 

ctcctggggg 

tgggctcctg 

gtaaagaagt 

ctccaacccc 

ttttactgaa 

gagagagaga 

gctcaagcaa 

gcacttgacc 

ctccatcacc 

ccaggtgcaa 

caccacgcct 

tgttggccag 

aagtgctggg 

ttatttttta 

agcactttga 

atagtgagac 

ctgtggtccc 

ggctgcagtg 

agcaaacaaa 

tttaaattga 

ttgtgttctt 

ctttaaaaaa 

ggccaaggca 

cctgatcttg 

atcccagcac 

ctggccaaca 



1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 
tagtgaaccc aatctctact ataaatacaa aaattagccg agtggggtgg cacgcacctg 5100

tagttccagc tactcaggag gatgaggccg gagaatcgct tgagcccggg aggcagaggc 5160 

tgcagtgagc cgagaccatg ccattgcact ccagcctagg tgacagagtg agactccgtc 5220 

ttaaaataat attaaaatct taaaatgatc tgggcatggt ggcttatgcc tgtagtccca 5280 

cccagctctt caggaggctg aagcgggagg attgcttcag cccaggaggt tgaggctgca 5340 

gtgagtcatg actgtgccgc tgcccttgag cctgggtaac agagcaagac cctatctcaa 5400 

aacaaacaaa caaacaaaca aacaaacaaa aaccaataaa ccaaaaacat ttatctaaac 5460 

aataaaataa aggacagata taatcaccga atatatgata gcattttaaa ttgaaaaagc 5520 

actaatgact acaatggatt ataaaacatc aaatacataa aattcttaag ttcctcctaa 5580 

taccaaatac aaagcacatt ggtctttggt ttttacttgg gcaccaatgc atgctgaaaa 5640 

agagtcgttc attttttaga gtagttttag gttcacagca aaattgagca gaaggtagag 5700

ttctcatgtg tctctttgct cctccccctg cccccagcct ccccactatc aacaccccca 5760 

cactacagtg gtagatttat tacaatccct gaacccacag tgacacatca ctatcaccca 5820 

aagttcatag cgtacagcag ggttcactct tgggcagtac attccatggg tttggataaa 5880 

tgtgtaatga tgtctccacc atcacagcat caggcagagt agtttcactg ctctaacaaa 5940 

atcctctgcc tattcacccc tctcattaaa gccaaacact ctgtttcctt ttttcctttt 6000 

agagacagtg tctcgctctg tcaaccaggc tgaagtgcaa tggcaatcac agcccattgc 6060 

agcctccaac tcctgggctc aagtgatcct cctatctcag cctccagtgg ctacgactgc 6120 

aggcatacgg caacggcacc caactaattt tttgtagaga tagggtcttg ctatgttgcc 6180 

caggctggtc ttgaactctt ggtcctgcct tagcctccca gagctctggg attacaggcg 6240 

tgaaccaccg tgcccgtccc aaacactctg tttcgacctg cttttaaaca actgaccctt 6300 

ggatgcattc aaaggatcag ggtgtctgaa actggcctct gcagcaggac cttccttcct 6360 

acacatctcc cagtggccag tgtgaggatt ctccccacaa gaaaccactg gagggggcct 6420 

cctcctgtcc gggtttgggg ctgtacaagg agcatcatgg acctggctca ggcctcagga 6480 

ggggccctgg gctggggaaa atgtgggata gcatcgaggc agtcccactc ctacccaggg 6540 

ccgggctaga cctggggaca gtctcagcca tctcctcgct gcgtccacac aattccaccc 6600 

ccacccccac ccccaggctg gccctcacgg aagaacaaca gctgatgttt gagaaactga 6660 

ctctgtattg cgacagctac atccagctca tccccatttc cttcgtgctg ggtgagttcc 6720 

cccttctggc tgttccgggt ccctgtggcc gcccaggctc cagacaggcc aggggaggat 6780 

cacgaggagc tgcggcaagg ggctggggag ggggcggggg aacgccagcg gcaggtcggc 6840 

gcctctctgt agggaaaggt gcggactgca gccagagaaa ctgaagttag acgttaggta 6900

agacgtcctg ccgttagcaa tgaaaacccc attttctgag ggaagcgctg acatcatggt 6960 

ccctggagcc cctgcgcggg aggggagggg gtctggcgga tttctgggac cagcaggggg 7020

acccccgggt gacagaaccc ttggggctct cgcgcctcca tgcgaggctc tgcctgcctc 7080 

tcgctcccga gcgccttcca ggagggctgg gggctaggcc cgctcgcagc agaaagctgg 7140 

aggagccgag gcatcgccgg gcgctgggcc ctgggctctg gccgcagcct ggcccctcgc 7200

ccctcgcccc ccgcccctcc tgcccaggct tctacgtgac gctggtcgtg acccgctggt 7260 

ggaaccagta cgagaacctg ccgtggcccg accgcctcat gagcctggtg tcgggcttcg 7320 

tcgaaggcaa ggacgagcaa ggccggctgc tgcggcgcac gctcatccgc tacgccaacc 7380

tgggcaacgt gctcatcctg cgcagcgtca gcaccgcagt ctacaagcgc ttccccagcg 7440 

cccagcacct ggtgcaagca ggtgggcgga ccgggagcaa cggggaggca ccgggcagag 7500 

ccaggggccg agatgggcgc ggcaggaacg gaagatgggt ggagccaaag tcccccggac 7560 

tcgggggact gggtggagcc aggagtgggg tgtggtcaag atttgggggt ccaattgggc 7620 

gggacagagt cgggtgtctg aaggtggggc gaggccagga gcccaccctc cgagagtagg 7680 

agtctgaggc agggctaagg acccttgagg gataatggaa agaagggtga cggcttggga 7740 

actggtgagg tactagggtc tacttccctc tgcccttgcc cctcttgatc tccggtttcc 7800 

actctggagg tatgggacat tggtctctga caccccctca gcctggcctg acctggtcct 7860 

ggttaataag acagacccag gctaggcgtg gtggctctcg cctgtaatcc cagtgcttta 7920 

ggaggcaaag gtgggaagat cgcttgagcc cagctgtttg agacgcccct gagcaacata 7980 

gcgagacccc catctctaca aaaacattaa aaattagcag ggcatggtgg cgtgtgcctg 8040 

tagtctgagg ctgagtatcg ggaggctgag gcaggaggat cacttgagcc cagcagttcc 8100 

aggctgcagt gcgctaagat cgcaccgctg cactccaacc tcggtgacag agccagaccc 8160 

tttctctgga aataaataaa taccctgccc acatgctcag cccagaacag cacctagtag 8220 

gtgctcagaa atttttttgt tgttgaaaga aagaggatgg caaaggagtg ctgaggttcc 8280 

tataggtcag caggtgccgg ccatcccttc tgcaggttct cccacccacc gccttcttca 8340 

ctccactctg caggctttat gactccggca gaacacaagc agttggagaa actgagccta 8400 

ccacacaaca tgttctgggt gccctgggtg tggtttgcca acctgtcaat gaaggcgtgg 8460 

cttggaggtc gaatccggga ccctatcctg ctccagagcc tgctgaacgt gagcccactg 8520 

tacagacagg gctgccgcag agtgggaagg gttgtggtcc acaggaaaca aggtttccta 8580 

caaagagaag ccttgggccc ctgagggtct tccgagagcc ggaggtgggg ttgcagaatc 8640 

ttttccaaca gcaatccaca gcccgaggtg gtcccttatc agaggcccct ccctcttctc 8700 

caagtctgtg aggtcctggt tcccttttga tagatgagga agctgagaca caaagaggtt 8760
tagtgagctt

tggagaagag 

ctcttctgcc 

cgactggatt 

cttttgggaa 

agaggctaag 

aagtgccaag 

ttatccccat 

ctttgggaag 

acatggtgaa 

gtaatcccag 

ttgcagtgag 

ctcaaacaaa 

aagtcacata 

cctggagcat 

ctcctcctcc 

tgggcggcag 

gcccgtcttc 

cagggccctg 

gaggatgcag 

cacacctgta 

gcggaggttg 

actctatctc 

cctttagaag 

gtagctgtcc 

ctcttaattc 

agtatgctct 

ccgccccatt 

ttgnnnnnnn 

tccttctgtt 

gggtgtaagt 

acacacacac 

tttggtttct 

cagtggcctg 

tttgttaggg 

ttcagcccag 

gtaagataca 

ctgtgggtgt 

gtgggtgcct 

aggcaggaag 

tggctttgag 

gcagctcatc 

caggaatttg 

gacccaaaga 

atctttgagg 

cagagagagg 

tctttctgtg 

atgcaccagg 

cccccctaca 

aacatcaggt 

tgcctaggaa 

cactgtacta 

ttatcccaag 

acatagctac 

ggaacgttag 

tatctcgctc 

cgcctcctgg 

gcccacaacc 

ggccaggctg 

ggaattatag 

aagtggggtt 

acaggtacct 



cccatggcca 

gtgggggcga 

ccccaggaga 

agtatcccac 

actgaggcta 

tggctcccct 

ttctaagagt 

attaaagaga 

ctgaggcagg 

accccatctc 

ctacttggga 

ctgagatcat 

caaacaaaca 

agtgtgcaag 

cctgatttca 

tcccaggtgg 

tttctgaacc 

acgttcctgc 

ctgggctgga 

tgtcaggaaa 

atcccagcta 

tggtgagttg 

aaaaacaaca 

cagagcgaac 

agtattctcc 

ctcctttgtg 

gcttctgcat 

ttagttggaa 

nnnagatgtt 

gcccacccac 

ctctgtctct 

acacacacac 

gcagatcaaa 

caagatgtcc 

cattttagag 

cttcagtata 

tgaggtgaga 

tcagggaagg 

acaagtgtgg 

ggcttcatgg 

gagttctgcc 

aacccctttg 

caggtatggg 

gagctcctcc 

ctgcaggcag 

gagagattcc 

ggacttcttc 

acctgcctcg 

cagctgcttc 

gtggccagag 

cttagaatag 

tgctctttat 

ttttcagata 

caaatggtgc 

gacctggctc 

tgtcgcccag 

gttcaagtga 

acaactggct 

gtctccaact 

gtgtcaaaac 

ccctgggatg 

cctgaattga 



cacagccagg 

gcccagggtg 

tgaacacctt 

tggtgtatac 

gaaggaccaa 

gggagttggg 

ccaggctcct 

ggttggccgg 

tggatcacct 

tactgaaaat 

ggctgaggca 

gccactgcac 

aacaaacaaa 

tcagaacaag 

gggttcccac 

tgactgtggc 

cagccaaggc 

agttcttctt 

ggcatggcca 

ggaaggtctc 

ctcgggaggc 

agatcgtgcc 

acaacaacaa 

actctcctat 

acacagcata 

ccaccatttt 

tgagacaaaa 

aaaaacttta 

ctgaatcaga 

tctctctccc 

gcccttcctg 

attcctattc 

acaaatcaca 

cctggacccc 

gttgctatcc 

tatctctgtt 

taaaggcagt 

actggctcag 

ggggctggag 

ggtgtggaaa 

tgagggtfcta 

gagaggatga 

gagagggaga 

ctcctgcagc 

gcacccatct 

tccaagtcat 

tgtccctggt 

gatggagccg 

cgcccagttc 

ccagggggct 

cactagttaa 

aaacattaac 

attaaagtac 

atttgctact 

ttgtcatcca 

gttggagcgc 

ttctcctgct 

aatttttgta 

cctgaccagt 

tatgttttct 

ggggaggggc 

ctttgtccta 



aatggaccat 

ggggcaggtg 

gcgtactcag 

acaggtgagg 

ggaagcagct 

tccacacttt 

gcctggccca 

gcacagtggc 

gaggtcagga 

acagaattag 

ggagaatcgc 

tccagcctgg 

caaacaaaca 

gccttggtct 

ctagcccttt 

ggtgtacagc 

ctaccctggc 

ctatgttggc 

gaggggtcat 

acgggtagaa 

tgaggcagga 

actgcactcc 

aacaaagccc 

taagatgctg 

atcgacagat 

ttcttctacc 

tacagagaga 

ttaaatcagg 

gagttttctc 

ttcctacctt 

tcactgtgac 

ctctaaattc 

cttttatgct 

taaggcagac 

aggaatctgc 

gcatgaatga 

gactcagccg 

aagagttaga 

ccctaaactc 

tagcagcagc 

cagagcctca 

tgatgatttt 

gaaaccatac 

cagtcattca 

ccccatttca 

caggcacata 

gaccaggtgt 

gacatgtact 

cgtcgagcct 

gggtgggaag 

tgcatacagg 

tatttttttc 

aggttcagag 

cgaaggacag 

gaactatgtt 

agtggcgtga 

tcagcctccc 

cttttagtag 

aatctgcccg 

gataagctac 

agcaaagtcc 

ccgagtaaag 



aggtaccagg 

gtgttcagaa 

tgtggacacc 

actaggctgg 

ggggtgggaa 

gaagttgggt 

gtccagtaga 

tcatgcctgt 

gttcgagacc 

ctgtgtggtg 

ttgaacccgg 

gcgacacagc 

aaggggttaa 

cctgtctcag 

gctaccacat 

ttcttcctga 

catgagctgg 

tggctgaagg 

ggccagcagc 

agcagccagg 

gaatcgcttg 

agcctgggca 

taaggttcag 

ttgggtgtct 

tctaatacaa 

tcctaattta 

gagaaagatc 

caagtaaaat 

tcgagctctt 

cctttatttt 

acacacacac 

cccctgcacc 

tgaaattctc 

gcgtgtcacc 

ccacctagac 

ataaaattat 

agtgatacac 

ggggctgtgt 

tgcctttgaa 

tgaggtttaa 

cctgtcccca 

gagaccaact 

catggacctt 

ctcacaggat 

caggcaggga 

caaggtcctg 

ccctgttggc 

ggaataagcc 

cctttatggg 

cccctcctag 

ttgcttcagt 

ctcccaataa 

agagtaagtt 

cctatgatca 

ttcttttctt 

tcttggctca 

cagtagctgg 

agatgaggtt 

ctttggcctc 

gatgcttgga 

cagcaggcag 

ggctcaggcc 



ccctggtacc 
ccccatcccc 
tgtatgccta 
tgaggctgcc 
gggctcacct 
ctggactttg 
ggcaatgtga 
aatcccagca 
agcctggcca 
gtgcacgcct 
gaggtggagg 
aagactctgt 
cagagcccct 
actcccagcc 
cctcctcctc 
cttgtctagt 
acctcgttgt 
tgggcctctc 
tgcttgagac 
cgtggtggcg 
aacccgggag 
aaagaatgaa 
aagcccctgc 
ttttcactca 
atttcttcaa 
tgaatgggtt 
tatcttaatc 
ccgccaagga 
tatctttcct 
ttggtaatgg 
acacacacac 
cccagttatc 
cagggtgccc 
tcttcggggc 
tgccctttag 
gcaactccag 
tcagggacag 
ccagaagtgt 
gacagtggtc 

agggggaagc 

aggtggcaga 
ggattgtcga 
ccccaaagtg 
tctcacctca 
aactgaggtc 
cctgggatga 
tgtggatgag 
cgagccacag 
ctccaccttc 
tgcaggggtc 
aagtgtcagg 
ttctggtttg 
gtccaaggcc 
gtgatgcagt 
tttgagacag 
ctgcaacctc 
gattacaggt 
tcaccatgtt 
ccaaaatgct 
tgggaagtgg 
ccaggccatc 
acccacagca 



8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
gccagactta
cagtgggctg 
gccttggaga 
cagtgctgtt 
gcccagagct 
caagcaattc 
cccggctaat 
gaactcctga 
gtgagccact 
tcgagtttaa 
ggctggaggc 
ctgtacccca 
gcatagacct 
aactctggat 
agtcatcttt 
atgggaagaa 
tgcagggtaa 
gtggatgtca 
cctcatctga 
ttacagggga 
tggatgttaa 
gatcaccggg 
gacggtgtgg 
ggccaggtgt 
tgaacaaaga 
gcatcattgg 
caaggaccaa 
accacaaggc 
aggctgtgga 
cacagacgcc 
agcttcacag 
gggccaagaa 
aagtatctca 
ccgaaaatca 
aagatcacat 
ggcactgcat 
acaatttcct 
ggggtatata 
cttcaccctg 
ctgaccagaa 
atgttatcac 
gcagtcagct 
actacaggaa 
agggtgtcca 
ttgcttcctt 
tttcctggac 
aagtgtaact 
ctattatgat 
gaacttggca 
ttgggctggg 
caacatgatg 
gcctgtagtc 
tgcagtgaga 
aaaaaaaaaa 
tgaaggtacc 
ttcctaacct 
cagcaggaca 
aacagcctga 
atgcctttta 
gactctggat 



tccccacatg 
aagggctatc 
gtgttgggca 
ctcgcttgtt 
ggagtgcagt 
tcctgcctca 
ttttgtattt 
cctcaggtga 
gtgcctggct 
ctacagtcta 
ttgcttaggt 
gcagctgaag 
tgtctccaag 
acaaggtaca 
gtctgcctgg 
tctgaacttg 
aggtctgggg 
ctcccagttg 
accactcatg 
agtgaatgat 
ctctggtcaa 
aggtacagta 
ccttggcttg 
tggtcctttg 
ggagatggag 
ccgcttccta 
actactgtgg 
agccaaacag 
cgccttcaag 
cctcagcccc 
tgtcacaggc 
aagttttgaa 
agtgaggagg 
cctcaaagaa 
ggatccttat 
tgccctgtgc 
agggttccat 
cttggccacc 
gtatcacccg 
tgctgctgga 
tggccccaac 
ggatgacaga 
agggtggcag 
gcgcccttag 
ggggtcagcc 
actgtcaccc 
tcccttttct 
tgaaacctta 
aacatctgtg 
tgtggaggca 
aaaccccatc 
ccaacgcagg 
ttgagcaact 
aggatcgtct 
ttccatactt 
gcttcctaat 
ctgatccagt 
atcaaatggt 
ttcataaaaa 
tcagagtcgg 



gtcccacttc 
ccagctggtc 
catgtcaggg 
cttttctttt 
ggcataatct 
gcctcctgag 
ttagtagaga 
tccaccctcc 
gcttgttctt 
tagatactgt 
cagaaacatt 
gttgttgagg 
gaatgcacaa 
aagtactgga 
gaggcggctt 
ggcaagggca 
ctcccgggat 
gaaccacaaa 
ccagggcacc 
gaggaggcct 
gagggaatca 
cagatcagga 
ggccaactga 
tccactggct 
ttccagccca 
ggcctgcagt 
cccaagaggg 
aacgttaggg 
tctgccccac 
actcccatgt 
atagacacca 
ttgctctcag 
aaaactgtgg 
cctttggaac 
tgggccttgg 
cccaccccag 
cactgccaga 
ttcacaggga 
gaagacttct 
gaactgcccc 
ttactttgag 
tgaacacttc 
gaactgcctc 
gtcattttct 
caaagctgtc 
aattataaac 
ggattctcaa 
aaagggcaac 
gcctgttcag 
agtgaatcac 
tctaccaaaa 
aggttgaggg 
gcaatccagc 
caacctttgc 
atgctgttaa 
ggggatgctt 
cacagccata 
tagcttaata 
ctgtgaaagc 
gaacccttag 



cctgattcca 
ctttctcccc 
ttcatactca 
ttttttttta 
cggctcactg 
tagctgggat 
cagtttcacc 
tcagcctccc 
ttaagaacca 
gaggaatggt 
tctggaggat 
gatggggagg 
tttatggagg 
tgtccagaaa 
ccagctgggt 
ggccatactc 
gcctgttgct 
ttcctggcat 
agtgtttctg 
ttacacgcca 
acaaacagtg 
gagaggtgag 
gagagaggag 
cagccctgca 
atcaggagga 
cccatgatca 
aatcccttct 
gccaggaaga 
tgtatcagag 
tcttccccct 
aagacaaaag 
agagcgatgg 
agtttaacct 
aatcaccaac 
aaaacaggtc 
cttcccttgc 
gcacactgga 
tcctagggaa 
tgggaccagg 
agggctgaca 
caagggtggc 
ccccataact 
actcctagga 
cactgcctgg 
acaaaatcag 
accccacttc 
gcagttactt 
aatttcantc 
caaaggatgt 
aggaggtcag 
aaaatacaaa 
gagaattgct 
ctgggcgacg 
cctcctactg
tactttcatt 
cgccagccag 
cagctgtcca 
gataaaaatc 
tagactgaac 
ttctatctga 



tctgaatccc 
aggacaacag 
agggtttctt 
aacggagttt 
caacctccgc 
tataggtgcc
atgttggcca 
aaagtgctgg 
aatatcctac 
tgggaaggtc 
gactttgagc 
gctgaaaaca 
gagctcaaac 
agggacagaa 
ctggagctga 
tctggtagat 
aggaagtcaa 
tgcccagagt 
actgcctgga 
ggcggggtgg 
aggtgagctg 
agctggggca 
cgggggtaag
tctcctgttt 
cgaggaggat 
ccatcctccc 
ccacgagggc 
caacaaggcc 
gccaggctac 
agaaccatca 
cttaaagact 
ggccttgatg 
gacggatatg
caacatacac 
tgtcctccac 
tctgagccta
cctacgccca 
gtgttcggga 
tgaaggaaga 
ggccaggctt
tgacccaaaa 
atttagggta 
actggtagat 
gaacctcacc 
atatttccct 
agccccaatc 
tcacgggtca
ttgcttctag
tcatatttaa 
gagtttgaga 
tcagctggcc 
tgaacccagg 
gagtgagact 
caacattttg 
ctcactaggg 
gtcctcacct 
cactgaagaa 
ccagactact 
cattggaaac 
atccaagaca 



tcttgagctg 
agttgaaagt 
ccacggtatc 
cactcttgtt 
ctcccagatt 
agccaccaag 
ggctggtctc 
gattacatgt 
tagactgcaa
atcaaatgaa 
cctacatggt 
gaacgataaa 
ccaagtctca 
catggaacac 
gccatggaac
aagctttcct 
atttctcttt 
cactcatggg 
gtgaggggtt 
ttgcgggggt 
ggcctggagg 
tggtgaggaa 
ggagaagtaa 
ctttccagcc 
gctcacgctg 
agggcaaact 
ctgcccaaaa 
tggaagctta 
tacagtgccc 
gcgccgtcaa 
gtgagttctg 
gagcacccag 
ccagagatcc 
actacactca 
ctgaaccagg 
cccttcctcc
gcactggctt 
ccttttctca 
tgaggttgtg 
agctgagcag 
ccatgaggtg 
gtacccaagc 
ggtgaggttg 
aaaatacttc 
ttattccaga 
acgtgggagg 
gaacacgcag 
gctaagacag 
gaatcttgtc 
ccaacctggc 
gtcgtggtgt 
aggtggtggt 
gtctcaaaaa 
gtatttgaaa 
atgaagcaca 
gtgtgtacac 
cgtgtcctac 
tcagccttta
atttaactca
gccacacctt 



12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 
agtatactgc ccaaactaat gagtttaata aatacaaata ctcgt 16125
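
Each full row of the listing carries six groups of ten bases followed by a running base count, and the final count should equal the `<211>` length. The sketch below checks that relationship, assuming rows arrive as whitespace-separated groups with the counter last; the function name `check_rows` is hypothetical.

```python
def check_rows(rows, start=0):
    """Return the running base count; raise ValueError if a row counter disagrees."""
    total = start
    for row in rows:
        parts = row.split()
        if not parts:
            continue
        # Everything but the last token is sequence; the last token is the counter.
        *groups, counter = parts
        total += sum(len(group) for group in groups)
        if total != int(counter):
            raise ValueError(f"expected {total} bases, row counter says {counter}")
    return total


# Final row of SEQ ID NO: 1, which follows the row numbered 16080.
last_row = "agtatactgc ccaaactaat gagtttaata aatacaaata ctcgt 16125"
print(check_rows([last_row], start=16080))  # 16125
```

The last row holds 45 bases, so the running count moves from 16080 to 16125, matching the `<211>` value given for this sequence.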



<210> 2 

<211> 2229 

<212> DNA 

<213> Homo sapiens



<400> 2 

cagggagtcc 

ccccacctga 

caagccaagt 

gcagcatcta 

tccgctttat 

ctctgtattg 

tgacgctggt 

tcatgagcct 

gcacgctcat 

cagtctacaa 

cagaacacaa 

tgtggtttgc 

tgctccagag 

cctacgactg 

gcttcttcct 

gccatgagct 

gctggctgaa 

agaccaactg 

accaggacct 

cctacacagc 

tcagcctgaa 

acgctggcat 

caaactcaag 

ccaaaaacca 

agcttaaggc 

gtgccccaca 

cgtcaaagct 

gttctggggc 

acccagaagt 

agatccccga 

cactcaaaga 

aacctgcttc 

ggacactgat 

cctgaatcaa 

ttttattcat 

tggattcaga 

actgcccaaa 

aaaaaaaaa 



caccagccta 
cccaagccca 
ggctaatgcc 
caagctgcta 
ttataggctg 
cgacagctac 
cgtgacccgc 
ggtgtcgggc 
ccgctacgcc 
gcgcttcccc 
gcagttggag 
caacctgtca 
cctgctgaac 
gattagtatc 
gacttgtcta 
ggacctcgtt 
ggtggcagag 
gattgtcgac 
gcctcggatg 
tgcttccgcc 
caaagaggag 
cattggccgc 
gaccaaacta 
caaggcagcc 
tgtggacgcc 
gacgcccctc 
tcacagtgtc 
caagaaaagt 
atctcaagtg 
aaatcacctc 
tcacatggat 
ctaatgggga 
ccagtcacag 
atggttagct 
aaaaactgtg 
gtcgggaacc 
ctaatgagtt 



gtcgccagac 
cctgctgcag 
cgcttaggct 
tatggcgagt 
gccctcacgg 
atccagctca 
tggtggaacc 
ttcgtcgaag 
aacctgggca 
agcgcccagc 
aaactgagcc 
atgaaggcgt 
gagatgaaca 
ccactggtgt 
gttgggcggc 
gtgcccgtct 
cagctcatca 
aggaatttgc 
gagccggaca 
cagttccgtc 
atggagttcc 
ttcctaggcc 
ctgtggccca 
aaacagaacg 
ttcaagtctg 
agccccactc 
acaggcatag 
tttgaattgc 
aggaggaaaa 
aaagaacctt 
ccttattggg 
tgcttcgcca 
ccatacagct 
taatagataa 
aaagctagac 
cttagttcta 
taataaatac 



cttctgtggg 
cccactgcct 
ccttctcccg 
tcttaatctt 
aagaacaaca 
tccccatttc 
agtacgagaa 
gcaaggacga 
acgtgctcat 
acctggtgca 
taccacacaa 
ggcttggagg 
ccttgcgtac 
atacacaggt 
agtttctgaa 
tcacgttcct 
acccctttgg 
aggtgtccct 
tgtactggaa 
gagcctcctt 
agcccaatca 
tgcagtccca 
agagggaatc 
ttaggggcca 
gcccactgta 
ccatgttctt 
acaccaaaga 
tctcagagag 
ctgtggagtt 
tggaacaatc 
ccttggaaaa 
gccaggtcct 
gtccacactg 
aaatcccaga 
tgaaccattg 
tc tgaatcca 
aaatactcgt 



atcatcggac 
ggccatgacc 
cctgctgctg 
cctgctctgc 
gctgatgttt 
cttcgtgctg 
cctgccgtgg 
gcaaggccgg 
cctgcgcagc 
agcaggcttt 
catgttctgg 
tcgaatccgg 
tcagtgtgga 
ggtgactgtg 
cccagccaag 
gcagttcttc 
agaggatgat 
gttggctgtg 
taagcccgag 
tatgggctcc 
ggaggacgag 
tgatcaccat 
ccttctccac 
ggaagacaac 
tcagaggcca 
ccccctagaa 
caaaagctta 
cgatggggcc 
taacctgacg 
accaaccaac 
cagggatgaa 
cacctgtgtg 
aagaacgtgt 
ctacttcagc 
gaaacattta 
agacagccac 
taaaaaaaaa 



ccacctggaa 
atcacttaca 
tgctggcggg 
tactacatca 
gagaaactga 
ggcttctacg 
cccgaccgcc 
ctgctgcggc 
gtcagcaccg 
atgactccgg 
gtgccctggg 
gaccctatcc 
cacctgtatg 
gcggtgtaca 
gcctaccctg 
ttctatgttg 
gatgattttg 
gatgagatgc 
ccacagcccc 
accttcaaca 
gaggatgctc 
cctcccaggg 
gagggcctgc 
aaggcctgga 
ggctactaca 
ccatcagcgc 
aagactgtga 
ttgatggagc 
gatatgccag 
atacacacta 
gcacattcct 
tacaccagca 
cctacaacag 
ctttaatgcc 
actcagactc 
accttagtat 
aaaaaaaaaa 



<210> 3 
<211> 585 
<212> PRT 

<213> Homo Sapiens 
<400> 3 

Met Thr Ile Thr Tyr Thr Ser Gln Val Ala Asn Ala Arg Leu Gly Ser

1 5 10 15

Phe Ser Arg Leu Leu Leu Cys Trp Arg Gly Ser Ile Tyr Lys Leu Leu

20 25 30

Tyr Gly Glu Phe Leu Ile Phe Leu Leu Cys Tyr Tyr Ile Ile Arg Phe

35 40 45

Ile Tyr Arg Leu Ala Leu Thr Glu Glu Gln Gln Leu Met Phe Glu Lys

50 55 60



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2229 
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50 










Leu Thr Leu Tyr Cys Asp Ser Tyr Ile Gln Leu Ile Pro Ile Ser Phe

65 70 75 80

Val Leu Gly Phe Tyr Val Thr Leu Val Val Thr Arg Trp Trp Asn Gln

85 90 95

Tyr Glu Asn Leu Pro Trp Pro Asp Arg Leu Met Ser Leu Val Ser Gly

100 105 110

Phe Val Glu Gly Lys Asp Glu Gln Gly Arg Leu Leu Arg Arg Thr Leu

115 120 125

Ile Arg Tyr Ala Asn Leu Gly Asn Val Leu Ile Leu Arg Ser Val Ser

130 135 140

Thr Ala Val Tyr Lys Arg Phe Pro Ser Ala Gln His Leu Val Gln Ala

145 150 155 160

Gly Phe Met Thr Pro Ala Glu His Lys Gln Leu Glu Lys Leu Ser Leu

165 170 175

Pro His Asn Met Phe Trp Val Pro Trp Val Trp Phe Ala Asn Leu Ser

180 185 190

Met Lys Ala Trp Leu Gly Gly Arg Ile Arg Asp Pro Ile Leu Leu Gln

195 200 205

Ser Leu Leu Asn Glu Met Asn Thr Leu Arg Thr Gln Cys Gly His Leu

210 215 220

Tyr Ala Tyr Asp Trp Ile Ser Ile Pro Leu Val Tyr Thr Gln Val Val

225 230 235 240

Thr Val Ala Val Tyr Ser Phe Phe Leu Thr Cys Leu Val Gly Arg Gln

245 250 255

Phe Leu Asn Pro Ala Lys Ala Tyr Pro Gly His Glu Leu Asp Leu Val

260 265 270

Val Pro Val Phe Thr Phe Leu Gln Phe Phe Phe Tyr Val Gly Trp Leu

275 280 285

Lys Val Ala Glu Gln Leu Ile Asn Pro Phe Gly Glu Asp Asp Asp Asp

290 295 300

Phe Glu Thr Asn Trp Ile Val Asp Arg Asn Leu Gln Val Ser Leu Leu

305 310 315 320

Ala Val Asp Glu Met His Gln Asp Leu Pro Arg Met Glu Pro Asp Met

325 330 335

Tyr Trp Asn Lys Pro Glu Pro Gln Pro Pro Tyr Thr Ala Ala Ser Ala

340 345 350

Gln Phe Arg Arg Ala Ser Phe Met Gly Ser Thr Phe Asn Ile Ser Leu

355 360 365

Asn Lys Glu Glu Met Glu Phe Gln Pro Asn Gln Glu Asp Glu Glu Asp

370 375 380

Ala His Ala Gly Ile Ile Gly Arg Phe Leu Gly Leu Gln Ser His Asp

385 390 395 400

His His Pro Pro Arg Ala Asn Ser Arg Thr Lys Leu Leu Trp Pro Lys

405 410 415

Arg Glu Ser Leu Leu His Glu Gly Leu Pro Lys Asn His Lys Ala Ala

420 425 430

Lys Gln Asn Val Arg Gly Gln Glu Asp Asn Lys Ala Trp Lys Leu Lys

435 440 445

Ala Val Asp Ala Phe Lys Ser Gly Pro Leu Tyr Gln Arg Pro Gly Tyr

450 455 460

Tyr Ser Ala Pro Gln Thr Pro Leu Ser Pro Thr Pro Met Phe Phe Pro

465 470 475 480

Leu Glu Pro Ser Ala Pro Ser Lys Leu His Ser Val Thr Gly Ile Asp

485 490 495

Thr Lys Asp Lys Ser Leu Lys Thr Val Ser Ser Gly Ala Lys Lys Ser

500 505 510

Phe Glu Leu Leu Ser Glu Ser Asp Gly Ala Leu Met Glu His Pro Glu

515 520 525

Val Ser Gln Val Arg Arg Lys Thr Val Glu Phe Asn Leu Thr Asp Met

530 535 540

Pro Glu Ile Pro Glu Asn His Leu Lys Glu Pro Leu Glu Gln Ser Pro

545 550 555 560

Thr Asn Ile His Thr Thr Leu Lys Asp His Met Asp Pro Tyr Trp Ala

565 570 575

Leu Glu Asn Arg Asp Glu Ala His Ser

580 585

<210> 4 

<211> 2429 

<212> DNA 

<213> Homo Sapiens 



<400> 4 

cagggagtcc caccagccta gtcgccagac cttctgtggg atcatcggac ccacctggaa 60

ccccacctga cccaagccca cctgctgcag cccactgcct ggccatgacc atcacttaca 120 

caagccaagt ggctaatgcc cgcttaggct ccttctcccg cctgctgctg tgctggcggg 180 

gcagcatcta caagctgcta tatggcgagt tcttaatctt cctgctctgc tactacatca 240 

tccgctttat ttataggctg gccctcacgg aagaacaaca gctgatgttt gagaaactga 300 

ctctgtattg cgacagctac atccagctca tccccatttc cttcgtgctg ggcttctacg 360 

tgacgctggt cgtgacccgc tggtggaacc agtacgagaa cctgccgtgg cccgaccgcc 420

tcatgagcct ggtgtcgggc ttcgtcgaag gcaaggacga gcaaggccgg ctgctgcggc 480 

gcacgctcat ccgctacgcc aacctgggca acgtgctcat cctgcgcagc gtcagcaccg 540 

cagtctacaa gcgcttcccc agcgcccagc acctggtgca agcaggcttt atgactccgg 600 

cagaacacaa gcagttggag aaactgagcc taccacacaa catgttctgg gtgccctggg 660 

tgtggtttgc caacctgtca atgaaggcgt ggcttggagg tcgaatccgg gaccctatcc 720

tgctccagag cctgctgaac gagatgaaca ccttgcgtac tcagtgtgga cacctgtatg 780 

cctacgactg gattagtatc ccactggtgt atacacaggt ggtgactgtg gcggtgtaca 840 

gcttcttcct gacttgtcta gttgggcggc agtttctgaa cccagccaag gcctaccctg 900 

gccatgagct ggacctcgtt gtgcccgtct tcacgttcct gcagttcttc ttctatgttg 960 

gctggctgaa ggtgggcctc tccagggccc tgctgggctg gaggcatggc cagaggggtc 1020

atggccagca gctgcttgag acgaggatgc agtgtcagga aaggaaggtc tcacgggtag 1080 

aaagcagcca ggcgtggtgg cgcacacctg taatcccagc tactcgggag gctgaggcag 1140 

gagaatcgct tgaacccggg aggcggaggt tgtggtggca gagcagctca tcaacccctt 1200 

tggagaggat gatgatgatt ttgagaccaa ctggattgtc gacaggaatt tgcaggtgtc 1260 

cctgttggct gtggatgaga tgcaccagga cctgcctcgg atggagccgg acatgtactg 1320

gaataagccc gagccacagc ccccctacac agctgcttcc gcccagttcc gtcgagcctc 1380 

ctttatgggc tccaccttca acatcagcct gaacaaagag gagatggagt tccagcccaa 1440 

tcaggaggac gaggaggatg ctcacgctgg catcattggc cgcttcctag gcctgcagtc 1500 

ccatgatcac catcctccca gggcaaactc aaggaccaaa ctactgtggc ccaagaggga 1560 

atcccttctc cacgagggcc tgcccaaaaa ccacaaggca gccaaacaga acgttagggg 1620

ccaggaagac aacaaggcct ggaagcttaa ggctgtggac gccttcaagt ctggcccact 1680 

gtatcagagg ccaggctact acagtgcccc acagacgccc ctcagcccca ctcccatgtt 1740 

cttcccccta gaaccatcag cgccgtcaaa gcttcacagt gtcacaggca tagacaccaa 1800 

agacaaaagc ttaaagactg tgagttctgg ggccaagaaa agttttgaat tgctctcaga 1860 

gagcgatggg gccttgatgg agcacccaga agtatctcaa gtgaggagga aaactgtgga 1920

gtttaacctg acggatatgc cagagatccc cgaaaatcac ctcaaagaac ctttggaaca 1980 

atcaccaacc aacatacaca ctacactcaa agatcacatg gatccttatt gggccttgga 2040 

aaacagggat gaagcacatt cctaacctgc ttcctaatgg ggatgcttcg ccagccaggt 2100 

cctcacctgt gtgtacacca gcaggacact gatccagtca cagccataca gctgtccaca 2160 

ctgaagaacg tgtcctacaa cagcctgaat caaatggtta gcttaataga taaaaatccc 2220

agactacttc agcctttaat gccttttatt cataaaaact gtgaaagcta gactgaacca 2280 

ttggaaacat ttaactcaga ctctggattc agagtcggga acccttagtt ctatctgaat 2340 

ccaagacagc cacaccttag tatactgccc aaactaatga gtttaataaa tacaaatact 2400 

cgttaaaaaa aaaaaaaaaa aaaaaaaaa 2429



<210> 5 
<211> 435 
<212> PRT 

<213> Homo Sapiens 
<400> 5 

Met Thr Ile Thr Tyr Thr Ser Gln Val Ala Asn Ala Arg Leu Gly Ser

1 5 10 15

Phe Ser Arg Leu Leu Leu Cys Trp Arg Gly Ser Ile Tyr Lys Leu Leu

20 25 30

Tyr Gly Glu Phe Leu Ile Phe Leu Leu Cys Tyr Tyr Ile Ile Arg Phe

35 40 45

Ile Tyr Arg Leu Ala Leu Thr Glu Glu Gln Gln Leu Met Phe Glu Lys

50 55 60

Leu Thr Leu Tyr Cys Asp Ser Tyr Ile Gln Leu Ile Pro Ile Ser Phe

65 70 75 80

Val Leu Gly Phe Tyr Val Thr Leu Val Val Thr Arg Trp Trp Asn Gln

85 90 95

Tyr Glu Asn Leu Pro Trp Pro Asp Arg Leu Met Ser Leu Val Ser Gly

100 105 110

Phe Val Glu Gly Lys Asp Glu Gln Gly Arg Leu Leu Arg Arg Thr Leu

115 120 125

Ile Arg Tyr Ala Asn Leu Gly Asn Val Leu Ile Leu Arg Ser Val Ser

130 135 140

Thr Ala Val Tyr Lys Arg Phe Pro Ser Ala Gln His Leu Val Gln Ala

145 150 155 160

Gly Phe Met Thr Pro Ala Glu His Lys Gln Leu Glu Lys Leu Ser Leu

165 170 175

Pro His Asn Met Phe Trp Val Pro Trp Val Trp Phe Ala Asn Leu Ser

180 185 190

Met Lys Ala Trp Leu Gly Gly Arg Ile Arg Asp Pro Ile Leu Leu Gln

195 200 205

Ser Leu Leu Asn Glu Met Asn Thr Leu Arg Thr Gln Cys Gly His Leu

210 215 220

Tyr Ala Tyr Asp Trp Ile Ser Ile Pro Leu Val Tyr Thr Gln Val Val

225 230 235 240

Thr Val Ala Val Tyr Ser Phe Phe Leu Thr Cys Leu Val Gly Arg Gln

245 250 255

Phe Leu Asn Pro Ala Lys Ala Tyr Pro Gly His Glu Leu Asp Leu Val

260 265 270

Val Pro Val Phe Thr Phe Leu Gln Phe Phe Phe Tyr Val Gly Trp Leu

275 280 285

Lys Val Gly Leu Ser Arg Ala Leu Leu Gly Trp Arg His Gly Gln Arg

290 295 300

Gly His Gly Gln Gln Leu Leu Glu Thr Arg Met Gln Cys Gln Glu Arg

305 310 315 320

Lys Val Ser Arg Val Glu Ser Ser Gln Ala Trp Trp Arg Thr Pro Val

325 330 335

Ile Pro Ala Thr Arg Glu Ala Glu Ala Gly Glu Ser Leu Glu Pro Gly

340 345 350

Arg Arg Arg Leu Trp Trp Gln Ser Ser Ser Ser Thr Pro Leu Glu Arg

355 360 365

Met Met Met Ile Leu Arg Pro Thr Gly Leu Ser Thr Gly Ile Cys Arg

370 375 380

Cys Pro Cys Trp Leu Trp Met Arg Cys Thr Arg Thr Cys Leu Gly Trp

385 390 395 400

Ser Arg Thr Cys Thr Gly Ile Ser Pro Ser His Ser Pro Pro Thr Gln

405 410 415

Leu Leu Pro Pro Ser Ser Val Glu Pro Pro Leu Trp Ala Pro Pro Ser

420 425 430

Thr Ser Ala

435



<210> 6 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<400> 6 

cagggagtcc caccagcc

<210> 7

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<400> 7 

tccccattag gaagcagg 

<210> 8 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<400> 8 

tctcctcttt gttcaggc 

<210> 9 

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<400> 9 

ctagtcgcca gaccttctgt g 

<210> 10 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 10 

cttgtagact gcggtgctga 

<210> 11 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 11 

gaaagcaagg acgagcaaag 

<210> 12 

<211> 22 

<212> DNA 

<213> Homo Sapiens 

<400> 12 

aatccagtcg taggcataca gg 

<210> 13 

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<400> 13 

accttgcgta ctcagtgtgg a

<210> 14

<211> 21

<212> DNA

<213> Homo Sapiens

<400> 14

tgtcgacaat ccagttggtc t 

<210> 15 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 15 

ccctttggag aggatgatga 

<210> 16 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 16 

ctctggcata tccgtcaggt 

<210> 17 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 17 

cttcaagtct gccccactgt 

<210> 18 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 18 

gcatccccat taggaagcag 

<210> 19 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 19 

ctaagcgggc attagccact 

<210> 20 

<211> 22 

<212> DNA 

<213> Homo Sapiens 

<400> 20 

tggggttcca ggtgggtccg at 

<210> 21 

<211> 27 

<212> DNA 

<213> Homo Sapiens 

<400> 21 

ccatcctaat acgactcact atagggc 

<210> 22 
<211> 27 
<212> DNA 



<213> Homo Sapiens
<400> 22 

ggatgaagca cattcctaac ctgcttc 27 

<210> 23 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<400> 23 

aaagctggag gagccgag 18 

<210> 24 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<400> 24 

ctccacccat cttccgttc 19 

<210> 25 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<400> 25 

taggctcaga gcaagggaag 20

<210> 26 

<211> 20 

<212> DNA 

<213> Mus Musculus 

<400> 26 

acacaacaca ttctgggtgc 20 

<210> 27 

<211> 20 

<212> DNA 

<213> Mus Musculus 

<400> 27 

ttcagaaact gcttcccgat 20 

<210> 28 

<211> 1916 

<212> DNA 

<213> Mus Musculus 

<400> 28 

gtgccaagcc atgactatca cctacacaaa caaagtagcc aatgcccgcc tcggttcgtt 60 

ctcgtccctc ctcctgtgct ggcgaggcag catctacaag ctgctgtatg gagaattcct 120 

tgtcttcata ttcctctact attccatccg tggactctac agaatggttc tctcgagtga 180 

tcagcagctg ttgtttgaga agctggctct gtactgcgac agctacattc agctcatccc 240 

tatatccttc gttctgggtt tctatgttac attggtggtg agccgctggt ggagccagta 300

cgagaacttg ccgtggcccg accgcctcat gatccaggtg tctagcttcg tggagggcaa 360 

ggatgaggaa ggccgtttgc tgcggcgcac gctcatccgc tacgccatcc tgggccaagt 420

gctcatcctg cgcagcatca gcacctcggt ctacaagcgc tttcccactc ttcaccacct 480 

ggtgctagca ggttttatga cccatgggga acataagcag ttgcagaagt tgggcctacc 540 

acacaacaca ttctgggtgc cctgggtgtg gtttgccaac ttgtcaatga aggcctatct 600 

tggaggtcga atccgggaca ccgtcctgct ccagagcctg atgaatgagg tgtgtacttt 660

gcgtactcag tgtggacagc tgtatgccta cgactggata agtatcccat tggtgtacac 720

acaggtggtg acagtggcag tatacagctt tttccttgca tgcttgatcg ggaggcagtt 780 

tctgaaccca aacaaggact acccaggcca tgagatggat ctggttgtgc ctgtcttcac 840 

aatcctgcaa ttcttattct acatgggctg gctgaaggtg gcagaacagc tcatcaaccc 900 

cttcggggag gacgatgatg attttgagac taactggatc attgacagaa acctgcaggt 960 

gtccctgttg tccgtggatg ggatgcacca gaacttgcct cccatggaac gtgacatgta 1020 

ctggaacgag gcagcgcctc agccgcccta cacagctgct tctgccaggt ctcgccggca 1080 

ttccttcatg ggctccacct tcaacatcag cctaaagaaa gaagacttag agctttggtc 1140 

aaaagaggag gctgacacgg ataagaaaga gagtggctat agcagcacca taggctgctt 1200

cttaggactg caacccaaaa actaccatct tcccttgaaa gacttaaaga ccaaactatt 1260 

gtgttctaag aaccccctcc tcgaaggcca gtgtaaggat gccaaccaga aaaaccagaa 1320 

agatgtctgg aaatttaagg gtctggactt cttgaaatgt gttccaaggt ttaagaggag 1380 

aggctcccat tgtggcccac aggcacccag cagccaccct actgagcagt cagcaccctc 1440 

cagttcagac acaggtgatg ggccttccac agattaccaa gaaatctgtc acatgaaaaa 1500 

gaaaactgtg gagtttaact tgaacattcc agagagcccc acagaacatc ttcaacagcg 1560 

ccgtttggac cagatgtcaa ccaatataca ggctctaatg aaggagcatg cagagtccta 1620 

tccctacagg gatgaagctg gcaccaaacc tgttctctat gagtgatgcc tcacagcctg 1680 

gccctgactt gcaaggatgc ccagcagggc actgacccag tcaaaggcac acaagcagcg 1740 

acacccagga gtgtgttccc acgacagtct agcatgtaac tcagaaccaa gagtacttaa 1800 

tagtcctgcc tgaaaacacc tgtattttac gatctttccc aaactaagga gtttaataaa 1860 

cgtgaatatt cttttaggtg aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 1916

STATEMENT REGARDING FEDERALLY-SPONSORED R&D
Not applicable. 



REFERENCE TO MICROFICHE APPENDIX 
Not applicable. 


FIELD OF THE INVENTION 

The present invention is directed to novel human and mouse DNA 
sequences encoding a protein which, when present in mutated form, results in the 
occurrence of Best's Macular Dystrophy. 


BACKGROUND OF THE INVENTION 

Macular dystrophy is a term applied to a heterogeneous group of
diseases that collectively are the cause of severe visual loss in a large number of
people. A common characteristic of macular dystrophy is a progressive loss of central
vision resulting from the degeneration of the pigmented epithelium underlying the
retinal macula. In many forms of macular dystrophy, the end stage of the disease
results in legal blindness. More than 20 types of macular dystrophy are known: e.g.,
age-related macular dystrophy, Stargardt's disease, atypical vitelliform macular
dystrophy (VMD1), Usher Syndrome Type IB, autosomal dominant neovascular
inflammatory vitreoretinopathy, familial exudative vitreoretinopathy, and Best's
macular dystrophy (also known as hereditary macular dystrophy or Best's vitelliform
macular dystrophy (VMD2)). For a review of the macular dystrophies, see Sullivan &
Daiger, 1996, Mol. Med. Today 2:380-386.

Best's Macular Dystrophy (BMD) is an inherited autosomal dominant
macular dystrophy of unknown biochemical cause. BMD has an age of onset that can
range from childhood to after 40. Clinical symptoms include, at early stages, an
abnormal accumulation of the yellowish material lipofuscin in the retinal pigmented
epithelium (RPE) underlying the macula. This gives rise to a characteristic "egg
yolk" appearance of the RPE and gradual loss of visual acuity. With increasing age,
the RPE becomes more and more disorganized, as the lipofuscin accumulations
disperse and scarring and neovascularization take place. These changes are
accompanied by further loss of vision.

The pathological features seen in BMD are in many ways similar to the
features seen in age-related macular dystrophy, the leading cause of blindness in older
patients in the developed world. Age-related macular dystrophy is an extraordinarily
difficult disease to study genetically, since by the time patients are diagnosed, their
parents are usually no longer living and their children are still asymptomatic. Thus,
family studies which have led to the discovery of the genetic basis of many other
diseases have not been practical for age-related macular dystrophy. As there are
currently no widely effective treatments for age-related macular dystrophy, it is hoped
that study of BMD, and in particular the discovery of the underlying genetic cause of
BMD, will shed light on age-related macular dystrophy as well.

Linkage analysis has established that the gene responsible for BMD
resides in the pericentric region of chromosome 11, at 11q13, near the markers
D11S956, FCER1B, and UGB (Forsman et al., 1992, Clin. Genet. 42:156-159; Hou et
al., 1996, Human Heredity 46:211-220). Recently, the gene responsible for BMD was
localized to a ~1.7 Mb PAC contig lying mostly between the markers D11S1765 and
UGB (Cooper et al., 1997, Genomics 41:185-192). Recombination breakpoint
mapping in a large Swedish pedigree limited the minimum genetic region containing
the BMD gene to a 980 kb interval flanked by the microsatellite markers D11S4076
and UGB (Graff et al., 1997, Hum. Genet. 101:263-279).

One difficulty in diagnosing BMD is that carriers of the diseased gene
for BMD may be asymptomatic in terms of visual acuity and morphological changes
of the RPE observable in a routine ophthalmologic examination. There does exist a
test, the electro-oculographic examination (EOG), which detects differences in
electrical potential between the cornea and the retina, that can distinguish
asymptomatic BMD patients from normal individuals. However, the EOG requires
specialized, expensive equipment, is difficult to administer, and requires that the
patient be present at the site of the equipment when the test is performed. It would be
valuable to have an alternative method of diagnosing asymptomatic carriers of
mutations in the gene responsible for BMD that is simpler, less expensive, and does
not require the presence of the patient while the test is being performed. For example,
a diagnostic test that relies on a blood sample from a patient suspected of being an
asymptomatic carrier of BMD would be ideal.

SUMMARY OF THE INVENTION 

The present invention is directed to novel human and mouse DNA
sequences that encode the gene CG1CE, which, when mutated, is responsible for
Best's macular dystrophy. The present invention includes genomic CG1CE DNA as
well as cDNA that encodes the CG1CE protein. The human genomic CG1CE DNA
is substantially free from other nucleic acids and has the nucleotide sequence shown
in SEQ.ID.NO.:1. The human cDNA encoding CG1CE protein is substantially free
from other nucleic acids and has the nucleotide sequence shown in SEQ.ID.NO.:2 or
SEQ.ID.NO.:4. The mouse cDNA encoding CG1CE protein is substantially free from
other nucleic acids and has the nucleotide sequence shown in SEQ.ID.NO.:28. Also
provided is CG1CE protein encoded by the novel DNA sequences. The human
CG1CE protein is substantially free from other proteins and has the amino acid
sequence shown in SEQ.ID.NO.:3 or SEQ.ID.NO.:5. The mouse CG1CE protein is
substantially free from other proteins and has the amino acid sequence shown in
SEQ.ID.NO.:29. Methods of expressing CG1CE protein in recombinant systems are
provided. Also provided are diagnostic methods that detect carriers of mutant
CG1CE genes.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A-F shows the genomic DNA sequence of human CG1CE
(SEQ.ID.NO.:1). Underlined nucleotides in capitals represent exons. The start ATG
codon in exon 2 and the stop TAA codon in exon 11 are shown in bold italics. The
consensus polyadenylation signal AATAAA in exon 11 is shown in bold. The
alternatively spliced part of exon 7 is shown in underlined italics. The exact lengths
of two gaps between exons 1 and 2 and between exons 7 and 8 are unknown; these
gaps are presented as runs of ten Ns for the sake of convenience. The portion of exon
11 beginning at position 15,788 represents the 3' untranslated region; 132 base pairs
downstream of the polyadenylation signal of the CG1CE gene are multiple ESTs,
representing the 3'-untranslated region of the ferritin heavy chain gene (FTH). FTH
has been mapped to human chromosome 11q13 (Hentze et al., 1986, Proc. Nat. Acad.
Sci. 83:7226-7230); the FTH gene was later shown to be a part of the smallest
minimum genetic region containing the BMD gene, as determined by recombination
breakpoint mapping in a 12 generation Swedish pedigree (Graff et al., 1997, Hum.
Genet. 101:263-279).

Figure 2 shows the complete sequence of the short form of human
CG1CE cDNA (SEQ.ID.NO.:2). The ATG start codon is at position 105; the TAA
stop codon is at position 1,860.

Figure 3 shows the complete amino acid sequence of the long form of 
human CG1CE protein (SEQ.ID.NO.:3). This long form of the human CG1CE 
protein is produced by translation of the short form of CG1CE cDNA. 

Figure 4 shows the complete sequence of the long form of human
CG1CE cDNA (SEQ.ID.NO.:4). This long form of the human CG1CE cDNA is
produced when an alternative splice donor site is utilized in intron 7. The ATG start
codon is at position 105; the TGA stop codon is at position 1,410.

Figure 5 shows the complete amino acid sequence of the short form of
the human CG1CE protein (SEQ.ID.NO.:5). This short form of the human CG1CE
protein is produced by translation of the long form of CG1CE cDNA.

Figure 6 shows the results of sequencing runs of PCR fragments that
represent exon 4 and adjacent intronic regions from three individuals from the
Swedish pedigree S1, two of whom are affected with BMD. From top to bottom, the
runs are: patient S1-5 (homozygous affected with BMD), sense orientation; patient
S1-4 (heterozygous affected with BMD), sense orientation; patient S1-3 (normal
control, unaffected sister of S1-4), sense orientation; patient S1-5 (affected with
BMD), anti-sense orientation; patient S1-4 (affected with BMD), anti-sense
orientation; patient S1-3 (normal control), anti-sense orientation. Reading from left to
right, the mutation shows up at position 31 of the sequence shown in the case of
patients S1-5 and S1-4. The mutation in family S1 changes tryptophan to cysteine.

Figure 7 shows a multiple sequence alignment of human CG1CE
protein with partial sequences of related proteins from C. elegans. Related proteins
from C. elegans were identified by BLASTP analysis of the non-redundant GenBank
database. This figure shows that two amino acids mutated in two different Swedish
families with BMD (families S1 and SL76) are evolutionarily conserved. 15 of 16
related proteins from C. elegans contain a tryptophan at the position of the mutation
in family S1, as does the wild-type CG1CE gene. Only one C. elegans protein does
not have a tryptophan at the position of the mutation. In this protein (accession
number p34577), tryptophan is changed for isofunctional phenylalanine
(phenylalanine is highly similar to tryptophan in that it also is a hydrophobic aromatic
amino acid). Mutation in the BMD family SL76 changes a tyrosine to histidine.
Again, all 16 related proteins from C. elegans contain tyrosine or isofunctional
phenylalanine in this position (tyrosine is highly similar to phenylalanine in that it
also is an aromatic amino acid).

Figure 8A-C shows the complete sequence of mouse CG1CE cDNA
(SEQ.ID.NO.:28) and mouse CG1CE protein (SEQ.ID.NO.:29).

Figure 9A-B shows an alignment of the amino acid sequences of the
long form of human CG1CE protein (SEQ.ID.NO.:3) and mouse CG1CE protein
(SEQ.ID.NO.:29). In this figure, CG1CE is referred to as "bestrophin."

Figure 10A-C shows the results of in situ hybridization experiments
demonstrating that mouse CG1CE mRNA expression is localized to the retinal
pigmented epithelium cells (RPE). Figure 10A shows the results of using an
antisense CG1CE probe. The antisense probe hybridizes to mouse CG1CE mRNA
present in the various cell layers of the retina, labeling with dark bands the cells
containing CG1CE mRNA. The antisense probe strongly hybridized to the RPE cells
and not to the cells of the other layers of the retina. Figure 10B shows the results
using a sense CG1CE probe as a control. The sense probe does not hybridize to
CG1CE mRNA and does not label the RPE cells. Figure 10C is a higher
magnification of the RPE cells from Figure 10A. Human CG1CE mRNA shows a
similar distribution, being confined to the RPE cells of the human retina.

DETAILED DESCRIPTION OF THE INVENTION
For the purposes of this invention:

"Substantially free from other proteins" means at least 90%, preferably
95%, more preferably 99%, and even more preferably 99.9%, free of other proteins.
Thus, a CG1CE protein preparation that is substantially free from other proteins will
contain, as a percent of its total protein, no more than 10%, preferably no more than
5%, more preferably no more than 1%, and even more preferably no more than 0.1%,
of non-CG1CE proteins. Whether a given CG1CE protein preparation is
substantially free from other proteins can be determined by such conventional
techniques of assessing protein purity as, e.g., sodium dodecyl sulfate polyacrylamide
gel electrophoresis (SDS-PAGE) combined with appropriate detection methods, e.g.,
silver staining or immunoblotting.

"Substantially free from other nucleic acids" means at least 90%,
preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other
nucleic acids. Thus, a CG1CE DNA preparation that is substantially free from other
nucleic acids will contain, as a percent of its total nucleic acid, no more than 10%,
preferably no more than 5%, more preferably no more than 1%, and even more
preferably no more than 0.1%, of non-CG1CE nucleic acids. Whether a given
CG1CE DNA preparation is substantially free from other nucleic acids can be
determined by such conventional techniques of assessing nucleic acid purity as, e.g.,
agarose gel electrophoresis combined with appropriate staining methods, e.g.,
ethidium bromide staining, or by sequencing.
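
By way of illustration only, the percentage thresholds in the two definitions above can be applied mechanically to a purity measurement. The Python sketch below is not part of the specification: the function name `purity_tier`, the tier labels, and the assumption that purity is supplied as an amount of CG1CE material against the total amount in the preparation (e.g., from gel densitometry) are all hypothetical.

```python
# Thresholds (percent free of other proteins or nucleic acids) taken from the
# definitions above; the labels are hypothetical names chosen for this sketch.
TIERS = [
    (99.9, "even more preferred"),
    (99.0, "more preferred"),
    (95.0, "preferred"),
    (90.0, "substantially free"),
]


def purity_tier(cg1ce_amount, total_amount):
    """Classify a preparation by the percent of its total that is CG1CE material."""
    if total_amount <= 0 or cg1ce_amount < 0 or cg1ce_amount > total_amount:
        raise ValueError("amounts must satisfy 0 <= cg1ce_amount <= total_amount, total > 0")
    percent = 100.0 * cg1ce_amount / total_amount
    for threshold, label in TIERS:  # highest threshold is checked first
        if percent >= threshold:
            return label
    return "not substantially free"


if __name__ == "__main__":
    print(purity_tier(96, 100))   # 96% CG1CE, 4% other material
    print(purity_tier(80, 100))   # 20% other material
```

A preparation with 4% contaminating material falls in the "preferred" tier (no more than 5%), while one with 20% contaminating material does not meet the definition at all.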

A "conservative amino acid substitution" refers to the replacement of
one amino acid residue by another, chemically similar, amino acid residue. Examples
of such conservative substitutions are: substitution of one hydrophobic residue
(isoleucine, leucine, valine, or methionine) for another; substitution of one polar
residue for another polar residue of the same charge (e.g., arginine for lysine;
glutamic acid for aspartic acid); substitution of one aromatic amino acid (tryptophan,
tyrosine, or phenylalanine) for another.
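
The examples in the definition above can be expressed as a simple lookup. The following Python sketch is illustrative only and rests on stated assumptions: the class names and the function `conservative_class` are hypothetical, and the charged classes contain only the residues named as examples in the text, so the table is not an exhaustive statement of what the definition covers.

```python
# Residue classes as listed in the definition above (three-letter codes).
# Class names are labels chosen for this sketch; the charged classes hold
# only the example residues given in the text.
CLASSES = {
    "hydrophobic": {"Ile", "Leu", "Val", "Met"},
    "positively charged": {"Arg", "Lys"},
    "negatively charged": {"Glu", "Asp"},
    "aromatic": {"Trp", "Tyr", "Phe"},
}


def conservative_class(original, replacement):
    """Return the shared class name if the substitution is conservative, else None."""
    if original == replacement:
        return None  # identical residues are not a substitution
    for name, members in CLASSES.items():
        if original in members and replacement in members:
            return name
    return None


if __name__ == "__main__":
    print(conservative_class("Trp", "Phe"))  # aromatic
    print(conservative_class("Trp", "Cys"))  # None
    print(conservative_class("Tyr", "His"))  # None
```

Under this table, tryptophan to phenylalanine is conservative, whereas tryptophan to cysteine and tyrosine to histidine are not.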

The present invention relates to the identification and cloning of
CG1CE, a gene which, when mutated, is responsible for Best's macular dystrophy.
That CG1CE is the Best's macular dystrophy gene is supported by various
observations:

1. CG1CE maps to the genetically defined region of human
chromosome 11q12-q13 that has been shown to contain the Best's macular dystrophy
gene. CG1CE is present on two PAC clones, 759J12 and 466A11, that lie precisely in
the most narrowly defined region that has been shown to contain CG1CE (Cooper et
al., 1997, Genomics 41:185-192; Stohr et al., 1997, Genome Res. 8:48-56; Graff et
al., 1997, Hum. Genet. 101:263-279).

2. CG1CE is expressed predominately in the retina. 






3. In patients having Best's macular dystrophy, CG1CE contains 
mutations in evolutionarily conserved amino acids. 

4. The CG1CE genomic clones contain another gene (FTH) that
has been physically associated with the Best's macular dystrophy region (Cooper et
al., 1997, Genomics 41:185-192; Stohr et al., 1997, Genome Res. 8:48-56; Graff et
al., 1997, Hum. Genet. 101:263-279). The FTH and CG1CE genes are oriented tail-
to-tail; the distance between their polyadenylation signals is 132 bp.

The present invention provides DNA encoding CG1CE that is
substantially free from other nucleic acids. The present invention also provides
recombinant DNA molecules encoding CG1CE. The present invention provides
DNA molecules substantially free from other nucleic acids comprising the nucleotide
sequence shown in Figure 1 as SEQ.ID.NO.:1. Analysis of SEQ.ID.NO.:1 revealed
that this genomic sequence defines a gene having 11 exons. These exons collectively
have an open reading frame that encodes a protein of 585 amino acids. If an
alternative splice donor site is utilized in exon 7, a cDNA containing an additional
203 bases is produced. Although longer, this cDNA contains a shorter open reading
frame of 1,305 bases (due to the presence of a change in reading frame that introduces
a stop codon) that encodes a protein of 435 amino acids. Thus, the present invention
includes two cDNA molecules encoding two forms of CG1CE protein that are
substantially free from other nucleic acids and have the nucleotide sequences shown
in Figure 2 as SEQ.ID.NO.:2 and in Figure 4 as SEQ.ID.NO.:4.

The present invention includes DNA molecules substantially free from
other nucleic acids comprising the coding regions of SEQ.ID.NO.:2 and
SEQ.ID.NO.:4. Accordingly, the present invention includes DNA molecules
substantially free from other nucleic acids having a sequence comprising positions
105-1,859 of SEQ.ID.NO.:2 and positions 105-1,409 of SEQ.ID.NO.:4. Also
included are recombinant DNA molecules having a nucleotide sequence comprising
positions 105-1,859 of SEQ.ID.NO.:2 and positions 105-1,409 of SEQ.ID.NO.:4.
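
As a quick consistency check, the coding-region coordinates given above can be related to the protein lengths stated in the specification (585 and 435 amino acids). The sketch below assumes 1-based, inclusive coordinates and that the quoted ranges exclude the stop codon; the function name `orf_summary` is hypothetical.

```python
def orf_summary(start: int, end: int) -> tuple:
    """Return (length in bases, number of codons) for a 1-based inclusive range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length, length // 3


print(orf_summary(105, 1859))  # (1755, 585) for SEQ.ID.NO.:2
print(orf_summary(105, 1409))  # (1305, 435) for SEQ.ID.NO.:4
```

Both ranges divide evenly into codons, and the second matches the 1,305-base open reading frame described for the alternatively spliced cDNA.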

Portions of the cDNA sequences of SEQ.ID.NO.:2 and SEQ.ID.NO.:4
are found in two retina-specific ESTs deposited in GenBank by The Institute for
Genomic Research (accession numbers AA318352 and AA317489). Other ESTs
that correspond to this cDNA are accession numbers AA307119 (from a colon
carcinoma), AA205892 (from a neuronal cell line), and AA326727 (from human
cerebellum). A true mouse ortholog of the CG1CE gene is represented in the mouse
EST AA497726 (from mouse testis).

The novel DNA sequences of the present invention encoding CG1CE,
in whole or in part, can be linked with other DNA sequences, i.e., DNA sequences to
which CG1CE is not naturally linked, to form "recombinant DNA molecules"
encoding CG1CE. Such other sequences can include DNA sequences that control
transcription or translation such as, e.g., translation initiation sequences, promoters
for RNA polymerase II, transcription or translation termination sequences, enhancer
sequences, sequences that control replication in microorganisms, sequences that
confer antibiotic resistance, or sequences that encode a polypeptide "tag" such as,
e.g., a polyhistidine tract or the myc epitope. The novel DNA sequences of the
present invention can be inserted into vectors such as plasmids, cosmids, viral
vectors, P1 artificial chromosomes, or yeast artificial chromosomes.

Included in the present invention are DNA sequences that hybridize to
at least one of SEQ.ID.NOs.:1, 2, or 4 under stringent conditions. By way of
example, and not limitation, a procedure using conditions of high stringency is as
follows: Prehybridization of filters containing DNA is carried out for 2 hr. to
overnight at 65°C in buffer composed of 6X SSC, 5X Denhardt's solution, and 100
µg/ml denatured salmon sperm DNA. Filters are hybridized for 12 to 48 hrs at 65°C
in prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and
5-20 X 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 hr in a
solution containing 2X SSC, 0.1% SDS. This is followed by a wash in 0.1X SSC,
0.1% SDS at 50°C for 45 min. before autoradiography.

Other procedures using conditions of high stringency would include
either a hybridization carried out in 5X SSC, 5X Denhardt's solution, 50% formamide
at 42°C for 12 to 48 hours or a washing step carried out in 0.2X SSC-equivalent
SSPE buffer, i.e., 0.2X SSPE, 0.2% SDS at 65°C for 30 to 60 minutes.
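
To make the "X" notation for SSC in the foregoing procedures concrete, the sketch below converts a fold concentration into molar salt concentrations. It assumes the commonly used 20X SSC stock of 3 M NaCl and 0.3 M sodium citrate; this composition is not stated in the present text and should be checked against the laboratory manual cited in the specification. The names `STOCK_20X_MOLAR` and `ssc_molarity` are hypothetical.

```python
# Assumption (not from the specification): 20X SSC is 3 M NaCl, 0.3 M sodium citrate.
STOCK_20X_MOLAR = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold: float) -> dict:
    """Return the molar concentration of each SSC component at the given fold strength."""
    return {name: molar * fold / 20 for name, molar in STOCK_20X_MOLAR.items()}


# Fold strengths used in the hybridization and wash steps described in the text.
for fold in (6, 5, 2, 0.1):
    print(fold, ssc_molarity(fold))
```

The lower the salt concentration of the final wash, the more stringent the conditions, which is why the procedure ends with the 0.1X SSC wash.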

Reagents mentioned in the foregoing procedures for carrying out high
stringency hybridization are well known in the art. Details of the composition of
these reagents can be found in, e.g., Sambrook, Fritsch, and Maniatis, 1989,
Molecular Cloning: A Laboratory Manual, second edition, Cold Spring Harbor
Laboratory Press. In addition to the foregoing, other conditions of high stringency
which may be used are well known in the art.

The degeneracy of the genetic code is such that, for all but two amino
acids, more than a single codon encodes a particular amino acid. This allows for the
construction of synthetic DNA that encodes the CG1CE protein where the nucleotide
sequence of the synthetic DNA differs significantly from the nucleotide sequences of
SEQ.ID.NOs.:2 or 4, but still encodes the same CG1CE protein as SEQ.ID.NOs.:2 or
4. Such synthetic DNAs are intended to be within the scope of the present invention.
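
The statement about degeneracy can be checked directly against the standard genetic code. The following is a minimal sketch that builds the standard codon table from its usual compact representation and counts codons per amino acid; the variable names are hypothetical and the one-letter amino acid codes are used.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Count how many codons encode each amino acid, ignoring stop codons ("*").
codons_per_residue = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
single_codon = sorted(aa for aa, n in codons_per_residue.items() if n == 1)
print(single_codon)  # ['M', 'W']
```

Only methionine and tryptophan are encoded by a single codon, so every other residue of the protein offers at least one synonymous alternative when a synthetic DNA is designed.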

Mutated forms of SEQ.ID.NOs.:1, 2, or 4 are intended to be within the
scope of the present invention. In particular, mutated forms of SEQ.ID.NOs.:1, 2, or
4 which give rise to Best's macular dystrophy are within the scope of the present
invention. Accordingly, the present invention includes a DNA molecule having a
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the nucleotide at
position 7,259 of SEQ.ID.NO.:1 is T, A, or C rather than G, so that the codon at
positions 7,257-7,259 encodes either cysteine or is a stop codon rather than encoding
tryptophan. Also included in the present invention is a DNA molecule having a
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that at least one of the
nucleotides at position 7,257 or 7,258 has been changed so that the codon at positions
7,257-7,259 does not encode tryptophan.
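
The outcome stated above for the tryptophan codon (replacement of the third-position G gives cysteine or a stop codon) follows from the standard genetic code, as the sketch below shows. It assumes the tryptophan codon is TGG, the only tryptophan codon in the standard code; the helper name `third_position_variants` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def third_position_variants(codon: str) -> dict:
    """Map each codon that differs only at the third base to what it encodes."""
    return {
        codon[:2] + base: CODON_TABLE[codon[:2] + base]
        for base in BASES
        if base != codon[2]
    }


print(third_position_variants("TGG"))  # {'TGT': 'C', 'TGC': 'C', 'TGA': '*'}
```

Here "C" is cysteine and "*" marks a stop codon, matching the T, C, and A substitutions described in the text.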

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that the
nucleotide at position 383 is T, A, or C rather than G, so that the codon at positions
381-383 encodes either cysteine or is a stop codon rather than encoding tryptophan.
Also included in the present invention is a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that at least
one of the nucleotides at position 381 or 382 has been changed so that the codon at
positions 381-383 does not encode tryptophan.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that the
nucleotide at position 383 is T, A, or C rather than G, so that the codon at positions
381-383 encodes either cysteine or is a stop codon rather than encoding tryptophan.
Also included in the present invention is a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that at least
one of the nucleotides at position 381 or 382 has been changed so that the codon at
positions 381-383 does not encode tryptophan.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to SEQ.ID.NO.:1 except that the nucleotide at position
7,233 of SEQ.ID.NO.:1 is C, A, or G rather than T, so that the codon at positions
7,233-7,235 does not encode tyrosine. Also included in the present invention is a
DNA molecule having a nucleotide sequence that is identical to SEQ.ID.NO.:1 except
that at least one of the nucleotides at position 7,234 or 7,235 has been changed so that
the codon at positions 7,233-7,235 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that the
nucleotide at position 357 is C, A, or G rather than T, so that the codon at positions
357-359 does not encode tyrosine. Also included in the present invention is a DNA
molecule having a nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that at least one of the nucleotides at position 358 or 359 has
been changed so that the codon at positions 357-359 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that the
nucleotide at position 357 is C, A, or G rather than T, so that the codon at positions
357-359 does not encode tyrosine. Also included in the present invention is a DNA
molecule having a nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that at least one of the nucleotides at position 358 or 359 has
been changed so that the codon at positions 357-359 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to SEQ.ID.NO.:1 except that the nucleotide at position
3,330 is C rather than A. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 3,330 of SEQ.ID.NO.:1 is G, C, or T rather than A, so that the
codon at positions 3,330-3,332 does not encode threonine. Also included in the
present invention is a DNA molecule having a nucleotide sequence that is identical to
SEQ.ID.NO.:1 except that at least one of the nucleotides at position 3,330 or 3,331
has been changed so that the codon at positions 3,330-3,332 does not encode
threonine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that the
nucleotide at position 120 is C rather than A. Also included in the present invention
is a DNA molecule having a nucleotide sequence that is identical to positions 105-
1,859 of SEQ.ID.NO.:2 except that the nucleotide at position 120 is G, C, or T rather
than A, so that the codon at positions 120-122 does not encode threonine. Also
included in the present invention is a DNA molecule having a nucleotide sequence
that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that at least one of the
nucleotides at position 120 or 121 has been changed so that the codon at positions
120-122 does not encode threonine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that the
nucleotide at position 120 is C rather than A. Also included in the present invention
is a DNA molecule having a nucleotide sequence that is identical to positions 105-
1,409 of SEQ.ID.NO.:4 except that the nucleotide at position 120 is G, C, or T rather
than A, so that the codon at positions 120-122 does not encode threonine. Also
included in the present invention is a DNA molecule having a nucleotide sequence
that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that at least one of the
nucleotides at position 120 or 121 has been changed so that the codon at positions
120-122 does not encode threonine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to SEQ.ID.NO.:1 except that the nucleotide at position
8,939 is A rather than T. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 8,939 of SEQ.ID.NO.:1 is A, G, or C, rather than T, so that the
codon at positions 8,939-8,941 does not encode tyrosine. Also included in the present
invention is a DNA molecule having a nucleotide sequence that is identical to
SEQ.ID.NO.:1 except that at least one of the nucleotides at positions 8,939-8,941 has
been changed so that the codon at positions 8,939-8,941 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that the
nucleotide at position 783 is A rather than T. Also included in the present invention
is a DNA molecule having a nucleotide sequence that is identical to positions 105-
1,859 of SEQ.ID.NO.:2 except that the nucleotide at position 783 is A, G, or C rather
than T, so that the codon at positions 783-785 does not encode tyrosine. Also included
in the present invention is a DNA molecule having a nucleotide sequence that is
identical to positions 105-1,859 of SEQ.ID.NO.:2 except that at least one of the
nucleotides at positions 783-785 has been changed so that the codon at positions 783-
785 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that the
nucleotide at position 783 is A rather than T. Also included in the present invention
is a DNA molecule having a nucleotide sequence that is identical to positions 105-
1,409 of SEQ.ID.NO.:4 except that the nucleotide at position 783 is A, G, or C rather
than T, so that the codon at positions 783-785 does not encode tyrosine. Also
included in the present invention is a DNA molecule having a nucleotide sequence
that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except that at least one of the
nucleotides at positions 783-785 has been changed so that the codon at positions 783-
785 does not encode tyrosine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to SEQ.ID.NO.:1 except that the nucleotide at position
11,241 is A rather than G. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 11,241 is A, C, or T, rather than G, so that the codon at
positions 11,240-11,242 does not encode glycine. Also included in the present
invention is a DNA molecule having a nucleotide sequence that is identical to
SEQ.ID.NO.:1 except that at least one of the nucleotides at position 11,240 or 11,241
has been changed so that the codon at positions 11,240-11,242 does not encode
glycine.

The present invention includes a DNA molecule having a nucleotide
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that the
nucleotide at position 1,000 is A rather than G. Also included in the present invention
is a DNA molecule having a nucleotide sequence that is identical to positions 105-
1,859 of SEQ.ID.NO.:2 except that the nucleotide at position 1,000 is A, C, or T
rather than G, so that the codon at positions 999-1,001 does not encode glycine. Also
included in the present invention is a DNA molecule having a nucleotide sequence
that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except that at least one of the
nucleotides at position 999 or 1,000 has been changed so that the codon at positions
999-1,001 does not encode glycine.
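
The cDNA codon positions quoted in the preceding paragraphs can be derived from the residue number and the start of the coding region. The sketch below assumes 1-based numbering and a coding region beginning at position 105 of SEQ.ID.NO.:2 and SEQ.ID.NO.:4, as stated in the text; the residue numbers (6, 85, 93, 227, 299) are those the specification assigns to the corresponding amino acid changes in SEQ.ID.NO.:3. The names `ORF_START` and `codon_positions` are hypothetical.

```python
ORF_START = 105  # first base of the coding region, per the text


def codon_positions(residue: int, orf_start: int = ORF_START) -> tuple:
    """Return the (first, last) cDNA positions of the codon for a 1-based residue."""
    first = orf_start + 3 * (residue - 1)
    return first, first + 2


for residue in (6, 85, 93, 227, 299):
    print(residue, codon_positions(residue))
# 6 (120, 122)
# 85 (357, 359)
# 93 (381, 383)
# 227 (783, 785)
# 299 (999, 1001)
```

Each computed range agrees with the codon positions recited above for threonine 6, tyrosine 85, tryptophan 93, tyrosine 227, and glycine 299.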

Another aspect of the present invention includes host cells that have
been engineered to contain and/or express DNA sequences encoding CG1CE protein.
Such recombinant host cells can be cultured under suitable conditions to produce
CG1CE protein. An expression vector containing DNA encoding CG1CE protein can
be used for expression of CG1CE protein in a recombinant host cell. Recombinant
host cells may be prokaryotic or eukaryotic, including but not limited to, bacteria such
as E. coli, fungal cells such as yeast, mammalian cells including, but not limited to,
cell lines of human, bovine, porcine, monkey and rodent origin, and insect cells
including but not limited to Drosophila and silkworm derived cell lines. Cell lines
derived from mammalian species which are suitable for recombinant expression of
CG1CE protein and which are commercially available, include but are not limited to,
L cells L-M(TK-) (ATCC CCL 1.3), L cells L-M (ATCC CCL 1.2), 293 (ATCC CRL
1573), Raji (ATCC CCL 86), CV-1 (ATCC CCL 70), COS-1 (ATCC CRL 1650),
COS-7 (ATCC CRL 1651), CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92),
NIH/3T3 (ATCC CRL 1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616),
BS-C-1 (ATCC CCL 26) and MRC-5 (ATCC CCL 171).

A variety of mammalian expression vectors can be used to express
recombinant CG1CE in mammalian cells. Commercially available mammalian
expression vectors which are suitable include, but are not limited to, pMC1neo
(Stratagene), pSG5 (Stratagene), pcDNAI and pcDNAIamp, pcDNA3, pcDNA3.1,
pCR3.1 (Invitrogen), EBO-pSV2-neo (ATCC 37593), pBPV-1(8-2) (ATCC 37110),
pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC 37199), pRSVneo
(ATCC 37198), and pSV2-dhfr (ATCC 37146). Following expression in
recombinant cells, CG1CE can be purified by conventional techniques to a level that
is substantially free from other proteins.

The present invention includes CG1CE protein substantially free from
other proteins. The amino acid sequence of the full-length CG1CE protein is shown
in Figure 3 as SEQ.ID.NO.:3. Thus, the present invention includes CG1CE protein
substantially free from other proteins having the amino acid sequence SEQ.ID.NO.:3.
Also included in the present invention is a CG1CE protein that is produced from an
alternatively spliced CG1CE mRNA where the protein has the amino acid sequence
shown in Figure 5 as SEQ.ID.NO.:5.

Mutated forms of CG1CE proteins are intended to be within the scope
of the present invention. In particular, mutated forms of SEQ.ID.NOs.:3 and 5 that
give rise to Best's macular dystrophy are within the scope of the present invention.
Accordingly, the present invention includes a protein having the amino acid sequence
shown in Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position 93 is
cysteine rather than tryptophan. The present invention also includes a protein having
the amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5 except that the amino
acid at position 93 is cysteine rather than tryptophan. The present invention includes
a protein having the amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except
that the amino acid at position 93 is not tryptophan. The present invention also
includes a protein having the amino acid sequence shown in Figure 5 as
SEQ.ID.NO.:5 except that the amino acid at position 93 is not tryptophan.

The present invention includes a protein having the amino acid
sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position
85 is histidine rather than tyrosine. The present invention also includes a protein
having the amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5 except that the
amino acid at position 85 is histidine rather than tyrosine. The present invention
includes a protein having the amino acid sequence shown in Figure 3 as
SEQ.ID.NO.:3 except that the amino acid at position 85 is not tyrosine. The present
invention also includes a protein having the amino acid sequence shown in Figure 5
as SEQ.ID.NO.:5 except that the amino acid at position 85 is not tyrosine.

The present invention includes a protein having the amino acid
sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position 6
is proline rather than threonine. The present invention also includes a protein having
the amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5 except that the amino
acid at position 6 is proline rather than threonine. The present invention includes a
protein having the amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except
that the amino acid at position 6 is not threonine. The present invention also includes
a protein having the amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5 except
that the amino acid at position 6 is not threonine.

The present invention includes a protein having the amino acid
sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position
227 is asparagine rather than tyrosine. The present invention also includes a protein
having the amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5 except that the
amino acid at position 227 is asparagine rather than tyrosine. The present invention
includes a protein having the amino acid sequence shown in Figure 3 as
SEQ.ID.NO.:3 except that the amino acid at position 227 is not tyrosine. The present
invention also includes a protein having the amino acid sequence shown in Figure 5
as SEQ.ID.NO.:5 except that the amino acid at position 227 is not tyrosine.

The present invention includes a protein having the amino acid
sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position
299 is glutamate rather than glycine. The present invention includes a protein having
the amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino
acid at position 299 is not glycine.

As with many proteins, it is possible to modify
many of the amino acids of CG1CE and still retain substantially the same biological
activity as the original protein. Thus, the present invention includes modified CG1CE
proteins which have amino acid deletions, additions, or substitutions but that still
retain substantially the same biological activity as CG1CE. It is generally accepted
that single amino acid substitutions do not usually alter the biological activity of a
protein (see, e.g., Molecular Biology of the Gene, Watson et al., 1987, Fourth Ed.,
The Benjamin/Cummings Publishing Co., Inc., page 226; and Cunningham & Wells,
1989, Science 244:1081-1085). Accordingly, the present invention includes
polypeptides where one amino acid substitution has been made in SEQ.ID.NOs.:3 or
5 wherein the polypeptides still retain substantially the same biological activity as
CG1CE. The present invention also includes polypeptides where two amino acid
substitutions have been made in SEQ.ID.NOs.:3 or 5 wherein the polypeptides still
retain substantially the same biological activity as CG1CE. In particular, the present
invention includes embodiments where the above-described substitutions are
conservative substitutions. In particular, the present invention includes embodiments
where the above-described substitutions do not occur in positions where the amino
acid present in CG1CE is also present in one of the C. elegans proteins whose partial
sequence is shown in Figure 7.

The CG1CE proteins of the present invention may contain post- 
translational modifications, e.g., covalently linked carbohydrate. 

The present invention also includes chimeric CG1CE proteins.
Chimeric CG1CE proteins consist of a contiguous polypeptide sequence of at least a
portion of a CG1CE protein fused to a polypeptide sequence of a non-CG1CE
protein.

The present invention also includes isolated forms of CG1CE proteins
and CG1CE DNA. By "isolated CG1CE protein" or "isolated CG1CE DNA" is
meant CG1CE protein or DNA encoding CG1CE protein that has been isolated from
a natural source. Use of the term "isolated" indicates that CG1CE protein or CG1CE
DNA has been removed from its normal cellular environment. Thus, an isolated
CG1CE protein may be in a cell-free solution or placed in a different cellular
environment from that in which it occurs naturally. The term "isolated" does not imply
that an isolated CG1CE protein is the only protein present, but instead means that an
isolated CG1CE protein is at least 95% free of non-amino acid material (e.g., nucleic
acids, lipids, carbohydrates) naturally associated with the CG1CE protein. Thus, a
CG1CE protein that is expressed through recombinant means in bacteria, or even in
eukaryotic cells which do not naturally (i.e., without human intervention) express it,
is an "isolated CG1CE protein."

A cDNA fragment encoding full-length CG1CE can be isolated from a
human retinal cell cDNA library by using the polymerase chain reaction (PCR)
employing suitable primer pairs. Such primer pairs can be selected based upon the
cDNA sequence for CG1CE shown in Figure 2 as SEQ.ID.NO.:2 or in Figure 4 as
SEQ.ID.NO.:4. Suitable primer pairs would be, e.g.:

CAGGGAGTCCCACCAGCC (SEQ.ID.NO.:6) and

TCCCCATTAGGAAGCAGG (SEQ.ID.NO.:7)

for SEQ.ID.NO.:2; and

CAGGGAGTCCCACCAGCC (SEQ.ID.NO.:6) and

TCTCCTCTTTGTTCAGGC (SEQ.ID.NO.:8)

for SEQ.ID.NO.:4.
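
When working with primer pairs such as those listed above, it can be useful to check basic properties such as length, GC content, and the reverse complement (the sequence a reverse primer pairs with on the template strand). The sketch below is illustrative only; the primer sequences are those given in the text, while the helper names `reverse_complement` and `gc_fraction` are hypothetical.

```python
# Primer sequences as listed in the text, written 5' to 3'.
PRIMERS = {
    "SEQ.ID.NO.:6": "CAGGGAGTCCCACCAGCC",
    "SEQ.ID.NO.:7": "TCCCCATTAGGAAGCAGG",
    "SEQ.ID.NO.:8": "TCTCCTCTTTGTTCAGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

All three primers are 18 bases long; whether a given pair is suitable for a particular template still depends on the full cDNA sequence, which is not reproduced here.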

PCR reactions can be carried out with a variety of thermostable
enzymes including but not limited to AmpliTaq, AmpliTaq Gold, or Vent polymerase.
For AmpliTaq, reactions can be carried out in 10 mM Tris-Cl, pH 8.3, 2.0 mM
MgCl2, 200 µM for each dNTP, 50 mM KCl, 0.2 µM for each primer, 10 ng of DNA
template, 0.05 units/µl of AmpliTaq. The reactions are heated at 95°C for 3 minutes
and then cycled 35 times using the cycling parameters of 95°C, 20 seconds, 62°C, 20
seconds, 72°C, 3 minutes. In addition to these conditions, a variety of suitable PCR
protocols can be found in PCR Primer, A Laboratory Manual, edited by C.W.
Dieffenbach and G.S. Dveksler, 1995, Cold Spring Harbor Laboratory Press; or PCR
Protocols: A Guide to Methods and Applications, Michael et al., eds., 1990,
Academic Press.
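
The cycling parameters above translate into a total programmed hold time and, for a chosen reaction volume, an enzyme amount. The sketch below works through that arithmetic. The 50 µl reaction volume is an assumption made only for illustration (the text does not specify a volume), ramp times between temperatures are ignored, and the names used are hypothetical.

```python
INITIAL_DENATURATION_S = 3 * 60   # 95°C for 3 minutes
CYCLES = 35
STEP_SECONDS = (20, 20, 3 * 60)   # 95°C, 62°C, 72°C holds per cycle
UNITS_PER_UL = 0.05               # AmpliTaq concentration from the text


def total_hold_seconds() -> int:
    """Total programmed hold time, ignoring temperature ramping."""
    return INITIAL_DENATURATION_S + CYCLES * sum(STEP_SECONDS)


def enzyme_units(reaction_volume_ul: float) -> float:
    """Units of polymerase needed for a reaction of the given volume."""
    return reaction_volume_ul * UNITS_PER_UL


print(total_hold_seconds())                  # 7880 seconds
print(round(total_hold_seconds() / 60, 1))   # 131.3 minutes
print(enzyme_units(50))                      # units for a hypothetical 50 µl reaction
```

Most of the run time comes from the 3-minute extension step, which is repeated in each of the 35 cycles.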

A suitable cDNA library from which a clone encoding CG1CE can be
isolated would be Human Retina 5'-stretch cDNA library in lambda gt10 or lambda
gt11 vectors (catalog numbers HL1143a and HL1132b, Clontech, Palo Alto, CA).
The primary clones of such a library can be subdivided into pools with each pool
containing approximately 20,000 clones and each pool can be amplified separately.

By this method, a cDNA fragment encoding an open reading frame of
585 amino acids (SEQ.ID.NO.:3) or an open reading frame of 435 amino acids
(SEQ.ID.NO.:5) can be obtained. This cDNA fragment can be cloned into a suitable
cloning vector or expression vector. For example, the fragment can be cloned into the
mammalian expression vector pcDNA3.1 (Invitrogen, San Diego, CA). CG1CE
protein can then be produced by transferring an expression vector encoding CG1CE
or portions thereof into a suitable host cell and growing the host cell under
appropriate conditions. CG1CE protein can then be isolated by methods well known
in the art.

As an alternative to the above-described PCR method, a cDNA clone
encoding CG1CE can be isolated from a cDNA library using as a probe
oligonucleotides specific for CG1CE and methods well known in the art for screening
cDNA libraries with oligonucleotide probes. Such methods are described in, e.g.,
Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual; Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York; Glover, D.M. (ed.), 1985, DNA
Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K., Vol. I, II.
Oligonucleotides that are specific for CG1CE and that can be used to screen cDNA
libraries can be readily designed based upon the cDNA sequence of CG1CE shown in
Figure 2 as SEQ.ID.NO.:2 or in Figure 4 as SEQ.ID.NO.:4 and can be synthesized by
methods well known in the art.

Genomic clones containing the CG1CE gene can be obtained from
commercially available human PAC or BAC libraries available from Research
Genetics, Huntsville, AL. PAC clones containing the CG1CE gene (e.g., PAC
759J12, PAC 466A11) are commercially available from Research Genetics,
Huntsville, AL (Catalog number for individual PAC clones is RPCLC).
Alternatively, one may prepare genomic libraries, especially in P1 artificial
chromosome vectors, from which genomic clones containing the CG1CE gene can be
isolated, using probes based upon the CG1CE sequences disclosed herein. Methods
of preparing such libraries are known in the art (Ioannou et al., 1994, Nature Genet.
6:84-89).

The novel DNA sequences of the present invention can be used in
various diagnostic methods relating to Best's macular dystrophy. The present
invention provides diagnostic methods for determining whether a patient carries a
mutation in the CG1CE gene that predisposes that patient toward the development of
Best's macular dystrophy. In broad terms, such methods comprise determining the
DNA sequence of a region of the CG1CE gene from the patient and comparing that
sequence to the sequence from the corresponding region of the CG1CE gene from a
normal person, i.e., a person who does not suffer from Best's macular dystrophy.

Such methods of diagnosis may be carried out in a variety of ways.
For example, one embodiment comprises:

(a) providing PCR primers from a region of the CG1CE gene
where it is suspected that a patient harbors a mutation in the CG1CE gene;

(b) performing PCR on a DNA sample from the patient to produce
a PCR fragment from the patient;

(c) performing PCR on a control DNA sample having a nucleotide
sequence selected from the group consisting of SEQ.ID.NOs.:1, 2 and SEQ.ID.NO.:4
to produce a control PCR fragment;

(d) determining the nucleotide sequence of the PCR fragment from
the patient and the nucleotide sequence of the control PCR fragment;

(e) comparing the nucleotide sequence of the PCR fragment from
the patient to the nucleotide sequence of the control PCR fragment;

where a difference between the nucleotide sequence of the PCR
fragment from the patient and the nucleotide sequence of the control PCR fragment
indicates that the patient has a mutation in the CG1CE gene.

In a particular embodiment, the PCR primers are from the coding
region of the CG1CE gene, i.e., from the coding region of SEQ.ID.NOs.:1, 2, or 4.

In a particular embodiment, the DNA sample from the patient is cDNA
that has been prepared from an RNA sample from the patient. In another
embodiment, the DNA sample from the patient is genomic DNA.

In a particular embodiment, the nucleotide sequences of the PCR
fragment from the patient and the control PCR fragment are determined by DNA
sequencing.

In a particular embodiment, the nucleotide sequences of the PCR
fragment from the patient and the control PCR fragment are compared by direct
comparison after DNA sequencing. In another embodiment, the comparison is made
by a process that includes hybridizing the PCR fragment from the patient and the
control PCR fragment and then using an endonuclease that cleaves at any mismatched
positions in the hybrid but does not cleave the hybrid if the two fragments match
perfectly. Such an endonuclease is, e.g., S1. In this embodiment, the conversion of
the PCR fragment from the patient to smaller fragments after endonuclease treatment
indicates that the patient carries a mutation in the CG1CE gene. In such
embodiments, it may be advantageous to label (radioactively, enzymatically,
immunologically, etc.) the PCR fragment from the patient or the control PCR
fragment.
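
The direct comparison after DNA sequencing described above can be sketched as a position-by-position comparison of two sequences. The example below is a simplified illustration that assumes the patient and control fragments are the same length and already aligned; real sequencing data may require alignment first. The function name `find_differences` and the short sequences are hypothetical and are not taken from the CG1CE sequences.

```python
def find_differences(patient: str, control: str) -> list:
    """Return (1-based position, control base, patient base) for each mismatch."""
    if len(patient) != len(control):
        raise ValueError("sequences must be the same length for direct comparison")
    return [
        (position, c, p)
        for position, (p, c) in enumerate(zip(patient, control), start=1)
        if p != c
    ]


# Hypothetical toy fragments: a G-to-T change at position 6 turns TGG into TGT.
control_fragment = "ATGTGGCTA"
patient_fragment = "ATGTGTCTA"
print(find_differences(patient_fragment, control_fragment))  # [(6, 'G', 'T')]
```

An empty list corresponds to fragments that match perfectly; any entry corresponds to a difference of the kind the method treats as indicating a mutation.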

The present invention provides a method of diagnosing whether a 
patient carries a mutation in the CG1CE gene that comprises: 

(a) obtaining an RNA sample from the patient;

(b) performing reverse transcription-PCR (RT-PCR) on the RNA
sample using primers that span a region of the coding sequence of the CG1CE gene to
produce a PCR fragment from the patient, where the PCR fragment from the patient
has a defined length, the length being dependent upon the identity of the primers that
were used in the RT-PCR;

(c) hybridizing the PCR fragment to DNA having a sequence
selected from the group consisting of SEQ.ID.NOs.:1, 2 and SEQ.ID.NO.:4 to form a
hybrid;

(d) treating the hybrid produced in step (c) with an endonuclease
that cleaves at any mismatched positions in the hybrid but does not cleave the hybrid
if the two fragments match perfectly;

(e) determining whether the endonuclease cleaved the hybrid by
determining the length of the PCR fragment from the patient after endonuclease
treatment, where a reduction in the length of the PCR fragment from the patient after
endonuclease treatment indicates that the patient carries a mutation in the CG1CE
gene.
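
The readout in step (e) can be sketched in Python under a deliberately simplified model: each mismatched base in the hybrid is treated as being removed by the endonuclease, so the patient fragment is split into the runs of perfectly matched bases. This model, the function names, and the example sequences are assumptions for illustration and do not describe real enzyme kinetics.

```python
def cleaved_fragment_lengths(patient: str, reference: str) -> list[int]:
    """Lengths of the matched runs left after cleavage at every mismatch.

    Assumption: sequences are pre-aligned, equal in length, and each
    mismatched base is digested away (a simplification).
    """
    if len(patient) != len(reference):
        raise ValueError("this sketch requires equal-length, aligned sequences")
    lengths, run = [], 0
    for p, r in zip(patient.upper(), reference.upper()):
        if p == r:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def indicates_mutation(patient: str, reference: str) -> bool:
    """True if the longest surviving fragment is shorter than the original."""
    longest = max(cleaved_fragment_lengths(patient, reference), default=0)
    return longest < len(patient)


# Hypothetical fragments: a single mismatch at the seventh base.
fragments = cleaved_fragment_lengths("ATGACCCTCA", "ATGACCGTCA")
```

A perfectly matched hybrid yields one fragment of the original length, so no reduction in length is observed.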

The present invention provides a method of diagnosing whether a
patient carries a mutation in the CG1CE gene that comprises:

(a) making cDNA from an RNA sample from the patient;

(b) providing a set of PCR primers based upon SEQ.ID.NO.:2 or
SEQ.ID.NO.:4;

(c) performing PCR on the cDNA to produce a PCR fragment
from the patient;

(d) determining the nucleotide sequence of the PCR fragment from
the patient;

(e) comparing the nucleotide sequence of the PCR fragment from
the patient with the nucleotide sequence of SEQ.ID.NO.:2 or SEQ.ID.NO.:4;

where a difference between the nucleotide sequence of the PCR
fragment from the patient and the nucleotide sequence of SEQ.ID.NO.:2 or
SEQ.ID.NO.:4 indicates that the patient carries a mutation in the CG1CE gene.
The present invention provides a method of diagnosing whether a
patient carries a mutation in the CG1CE gene that comprises:

(a) preparing genomic DNA from the patient;

(b) providing a set of PCR primers based upon SEQ.ID.NO.:1,
SEQ.ID.NO.:2, or SEQ.ID.NO.:4;

(c) performing PCR on the genomic DNA to produce a PCR
fragment from the patient;

(d) determining the nucleotide sequence of the PCR fragment from
the patient;

(e) comparing the nucleotide sequence of the PCR fragment from
the patient with the nucleotide sequence of SEQ.ID.NO.:2 or SEQ.ID.NO.:4;

where a difference between the nucleotide sequence of the PCR
fragment from the patient and the nucleotide sequence of SEQ.ID.NO.:2 or
SEQ.ID.NO.:4 indicates that the patient carries a mutation in the CG1CE gene.

In a particular embodiment, the primers are selected so that they
amplify a portion of SEQ.ID.NOs.:2 or 4 that includes at least one position selected
from the group consisting of: positions 120, 121, 122, 357, 358, 359, 381, 382, 383,
783, 784, and 785. In another embodiment, the primers are selected so that they
amplify a portion of SEQ.ID.NOs.:2 or 4 that includes at least one position selected
from the group consisting of: positions 384, 385, and 386. In another embodiment,
the primers are selected so that they amplify a portion of SEQ.ID.NO.:2 that includes
at least one position selected from the group consisting of: positions 999, 1,000, and
1,001. In another embodiment, the primers are selected so that they amplify a portion
of SEQ.ID.NOs.:2 or 4 that includes at least one codon that encodes an amino acid
present in CG1CE that is also present in the corresponding position in at least one of
the C. elegans proteins whose partial amino acid sequence is shown in Figure 7.

In a particular embodiment, the present invention provides a diagnostic
method for determining whether a person carries a mutation of the CG1CE gene in
which the G at position 383 of SEQ.ID.NO.:2 has been changed to a C. This change
results in the creation of a Fnu4HI restriction site. By amplifying a PCR fragment
spanning position 383 of SEQ.ID.NO.:2 from DNA or cDNA prepared from a person,
digesting the PCR fragment with Fnu4HI, and visualizing the digestion products, e.g.,
by SDS-PAGE, one can easily determine if the person carries the G383C mutation.
For example, one could use the PCR primer pair 5'-CTCCTGCCCAGGCTTCTAC-3'
(SEQ.ID.NO.:30) and 5'-CTTGCTCTGCCTTGCCTTC-3' (SEQ.ID.NO.:31) to
amplify a 125 base pair fragment. Heterozygotes for the G383C mutation have three
Fnu4HI digestion products: 125 bp, 85 bp, and 40 bp; homozygotes have two: 85 bp
and 40 bp; and wild-type individuals have a single fragment of 125 bp.
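
The three digestion patterns described above amount to a small lookup from observed band sizes to genotype. The Python sketch below encodes only the fragment sizes stated in the text (125, 85, and 40 bp); the function name is hypothetical, and the assumption that band sizes are read exactly, with no partial digestion or sizing error, is a simplification.

```python
# Fragment sizes (bp) stated in the text for the 125 bp amplicon.
UNCUT = 125
CUT_PRODUCTS = {85, 40}


def call_g383c_genotype(observed_sizes_bp) -> str:
    """Map the set of observed Fnu4HI digestion products to a genotype call."""
    sizes = set(observed_sizes_bp)
    if sizes == {UNCUT}:
        return "wild-type"
    if sizes == CUT_PRODUCTS:
        return "homozygous G383C"
    if sizes == CUT_PRODUCTS | {UNCUT}:
        return "heterozygous G383C"
    return "uninterpretable"


call = call_g383c_genotype([125, 85, 40])
```

Any other band pattern is reported as uninterpretable rather than forced into one of the three classes.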

In a particular embodiment, the present invention provides a diagnostic
method for determining whether a person carries a mutation of the CG1CE gene in
which the T at position 783 of SEQ.ID.NO.:2 has been changed to an A. This change
results in the creation of a PflMI restriction site. By amplifying a PCR fragment
spanning position 783 of SEQ.ID.NO.:2 from DNA or cDNA prepared from a person,
digesting the PCR fragment with PflMI, and visualizing the digestion products, e.g.,
by SDS-PAGE, one can easily determine if the person carries the T783A mutation.

The present invention also provides oligonucleotide probes, based
upon the sequences of SEQ.ID.NOs.:1, 2, or 4, that can be used in diagnostic methods
related to Best's macular dystrophy. In particular, the present invention includes
DNA oligonucleotides comprising at least 18 contiguous nucleotides of at least one of
a sequence selected from the group consisting of: SEQ.ID.NOs.:1, 2 and
SEQ.ID.NO.:4. Also provided by the present invention are corresponding RNA
oligonucleotides. The DNA or RNA oligonucleotide probes can be packaged in kits.

In addition to the diagnostic utilities described above, the present
invention makes possible the recombinant expression of the CG1CE protein in
various cell types. Such recombinant expression makes possible the study of this
protein so that its biochemical activity and its role in Best's macular dystrophy can be
elucidated.

The present invention also makes possible the development of assays
which measure the biological activity of the CG1CE protein. Such assays using
recombinantly expressed CG1CE protein are especially of interest. Assays for
CG1CE protein activity can be used to screen libraries of compounds or other sources
of compounds to identify compounds that are activators or inhibitors of the activity of
CG1CE protein. Such identified compounds can serve as "leads" for the development
of pharmaceuticals that can be used to treat patients having Best's macular dystrophy.
In versions of the above-described assays, mutant CG1CE proteins are used and
inhibitors or activators of the activity of the mutant CG1CE proteins are discovered.
Such assays comprise:

(a) recombinantly expressing CG1CE protein or mutant CG1CE
protein in a host cell;

(b) measuring the biological activity of CG1CE protein or mutant
CG1CE protein in the presence and in the absence of a substance suspected of being
an activator or an inhibitor of CG1CE protein or mutant CG1CE protein;

where a change in the biological activity of the CG1CE protein or the
mutant CG1CE protein in the presence as compared to the absence of the substance
indicates that the substance is an activator or an inhibitor of CG1CE protein or mutant
CG1CE protein.

The present invention also includes antibodies to the CG1CE protein.
Such antibodies may be polyclonal antibodies or monoclonal antibodies. The
antibodies of the present invention are raised against the entire CG1CE protein or
against suitable antigenic fragments of the protein that are coupled to suitable carriers,
e.g., serum albumin or keyhole limpet hemocyanin, by methods well known in the art.
Methods of identifying suitable antigenic fragments of a protein are known in the art.
See, e.g., Hopp & Woods, 1981, Proc. Natl. Acad. Sci. USA 78:3824-3828; and
Jameson & Wolf, 1988, CABIOS (Computer Applications in the Biosciences)
4:181-186.

For the production of polyclonal antibodies, CG1CE protein or an
antigenic fragment, coupled to a suitable carrier, is injected on a periodic basis into an
appropriate non-human host animal such as, e.g., rabbits, sheep, goats, rats, or mice.
The animals are bled periodically and the sera obtained are tested for the presence of
antibodies to the injected antigen. The injections can be intramuscular,
intraperitoneal, subcutaneous, and the like, and can be accompanied by adjuvant.

For the production of monoclonal antibodies, CG1CE protein or an
antigenic fragment, coupled to a suitable carrier, is injected into an appropriate
non-human host animal as above for the production of polyclonal antibodies. In the case
of monoclonal antibodies, the animal is generally a mouse. The animal's spleen cells
are then immortalized, often by fusion with a myeloma cell, as described in Kohler &
Milstein, 1975, Nature 256:495-497. For a fuller description of the production of
monoclonal antibodies, see Antibodies: A Laboratory Manual, Harlow & Lane, eds.,
Cold Spring Harbor Laboratory Press, 1988.

Gene therapy may be used to introduce CG1CE polypeptides into the
cells of target organs, e.g., the pigmented epithelium of the retina or other parts of the
retina. Nucleotides encoding CG1CE polypeptides can be ligated into viral vectors
which mediate transfer of the nucleotides by infection of recipient cells. Suitable
viral vectors include retrovirus, adenovirus, adeno-associated virus, herpes virus,
vaccinia virus, and polio virus based vectors. Alternatively, nucleotides encoding
CG1CE polypeptides can be transferred into cells for gene therapy by non-viral
techniques including receptor-mediated targeted transfer using ligand-nucleotide
conjugates, lipofection, membrane fusion, or direct microinjection. These procedures
and variations thereof are suitable for ex vivo as well as in vivo gene therapy. Gene
therapy with CG1CE polypeptides will be particularly useful for the treatment of
diseases where it is beneficial to elevate CG1CE activity.

The present invention includes DNA comprising nucleotides encoding
mouse CG1CE. Included within such DNA is the DNA sequence shown in Figure
8A-C (SEQ. ID. NO.:28). Also included is DNA comprising positions 11-1,663 of
SEQ. ID. NO.:28. Also included are mutant versions of DNA encoding mouse
CG1CE. Included is DNA comprising nucleotides that are identical to positions
11-1,663 of SEQ. ID. NO.:28 except that at least one of the nucleotides at positions
26-28, positions 263-265, positions 287-289, positions 689-691, and/or positions
905-907 differs from the corresponding nucleotide at positions 26-28, positions 263-265,
positions 287-289, positions 689-691, and/or positions 905-907 of SEQ. ID. NO.:28.
Particularly preferred versions of mutant DNAs are those in which the nucleotide
change results in a change in the corresponding encoded amino acid. The DNA
encoding mouse CG1CE can be in isolated form, can be substantially free from other
nucleic acids, and/or can be recombinant DNA.

The present invention includes mouse CG1CE protein (SEQ. ID.
NO.:29). This mouse CG1CE protein can be in isolated form and/or can be
substantially free from other proteins. Mutant versions of mouse CG1CE protein are
also part of the present invention. Examples of such mutant mouse CG1CE proteins
are proteins that are identical to SEQ. ID. NO.:29 except that the amino acid at
position 6, position 85, position 93, position 227, and/or position 299 differs from the
corresponding amino acid at position 6, position 85, position 93, position 227, and/or
position 299 in SEQ. ID. NO.:29.
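
The nucleotide triplets and amino acid positions listed above line up arithmetically if one assumes that position 11 of SEQ. ID. NO.:28 is the first base of the start codon. That reading is an assumption made here for illustration (the text states only that the DNA comprises positions 11-1,663); the short Python sketch below, with a hypothetical function name, shows the mapping.

```python
ORF_START = 11  # assumption: first base of the start codon in SEQ. ID. NO.:28


def codon_number(nt_position: int, orf_start: int = ORF_START) -> int:
    """Return the 1-based codon (amino acid) number containing a nucleotide."""
    if nt_position < orf_start:
        raise ValueError("position lies upstream of the assumed reading frame")
    return (nt_position - orf_start) // 3 + 1


# Nucleotide triplets named in the text for the mouse sequence.
triplets = [(26, 28), (263, 265), (287, 289), (689, 691), (905, 907)]

# Both ends of each triplet should fall in the same codon.
codons = [codon_number(first) for first, last in triplets]
```

Under this assumption the five triplets correspond to amino acid positions 6, 85, 93, 227, and 299, matching the positions given for SEQ. ID. NO.:29.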

cDNA encoding mouse CG1CE can be amplified by PCR from cDNA
libraries made from mouse eye or mouse testis. Suitable primers can be readily
designed based upon SEQ. ID. NO.:28. Alternatively, cDNA encoding mouse
CG1CE can be isolated from cDNA libraries made from mouse eye or mouse testis by
the use of oligonucleotide probes based upon SEQ. ID. NO.:28.

In situ hybridization studies demonstrated that mouse CG1CE is
specifically expressed in the retinal pigmented epithelium (see Figure 10).

By providing DNA encoding mouse CG1CE, the present invention
allows for the generation of an animal model of Best's macular dystrophy. This
animal model can be generated by making "knockout" or "knockin" mice containing
altered CG1CE genes. Knockout mice can be generated in which portions of the
mouse CG1CE gene have been deleted. Knockin mice can be generated in which
mutations that have been shown to lead to Best's macular dystrophy when present in
the human CG1CE gene are introduced into the mouse gene. In particular, mutations
resulting in changes in amino acids 6, 85, 93, 227, or 299 of the mouse CG1CE
protein (SEQ.ID.NO.:29) are contemplated. Such knockout and knockin mice will be
valuable tools in the study of the Best's macular dystrophy disease process and will
provide important model systems in which to test potential pharmaceuticals or
treatments for Best's macular dystrophy.

Methods of producing knockout and knockin mice are well known in
the art. For example, the use of gene-targeted ES cells in the generation of
gene-targeted transgenic knockout mice is described in, e.g., Thomas et al., 1987, Cell
51:503-512, and is reviewed elsewhere (Frohman et al., 1989, Cell 56:145-147;
Capecchi, 1989, Trends in Genet. 5:70-76; Baribault et al., 1989, Mol. Biol. Med.
6:481-492).

Techniques are available to inactivate or alter any genetic region to
virtually any mutation desired by using targeted homologous recombination to insert
specific changes into chromosomal genes. Generally, use is made of a "targeting
vector," i.e., a plasmid containing part of the genetic region it is desired to mutate. By
virtue of the homology between this part of the genetic region on the plasmid and the
corresponding genetic region on the chromosome, homologous recombination can be
used to insert the plasmid into the genetic region, thus disrupting the genetic region.
Usually, the targeting vector contains a selectable marker gene as well.

In comparison with homologous extrachromosomal recombination,
which occurs at frequencies approaching 100%, homologous plasmid-chromosome
recombination was originally reported to only be detected at frequencies between 10⁻⁶
and 10⁻³ (Lin et al., 1985, Proc. Natl. Acad. Sci. USA 82:1391-1395; Smithies et al.,
1985, Nature 317:230-234; Thomas et al., 1986, Cell 44:419-428).
Nonhomologous plasmid-chromosome interactions are more frequent, occurring at
levels 10⁵-fold (Lin et al., 1985, Proc. Natl. Acad. Sci. USA 82:1391-1395) to
10²-fold (Thomas et al., 1986, Cell 44:419-428) greater than comparable homologous
insertion.

To overcome this low proportion of targeted recombination in murine
ES cells, various strategies have been developed to detect or select rare homologous
recombinants. One approach for detecting homologous alteration events uses the
polymerase chain reaction (PCR) to screen pools of transformant cells for
homologous insertion, followed by screening individual clones (Kim et al., 1988,
Nucleic Acids Res. 16:8887-8903; Kim et al., 1991, Gene 103:227-233).
Alternatively, a positive genetic selection approach has been developed in which a
marker gene is constructed which will only be active if homologous insertion occurs,
allowing these recombinants to be selected directly (Sedivy et al., 1989, Proc. Natl.
Acad. Sci. USA 86:227-231). One of the most powerful approaches developed for
selecting homologous recombinants is the positive-negative selection (PNS) method
developed for genes for which no direct selection of the alteration exists (Mansour et
al., 1988, Nature 336:348-352; Capecchi, 1989, Science 244:1288-1292; Capecchi,
1989, Trends in Genet. 5:70-76). The PNS method is more efficient for targeting
genes which are not expressed at high levels because the marker gene has its own
promoter. Nonhomologous recombinants are selected against by using the Herpes
Simplex virus thymidine kinase (HSV-TK) gene and selecting against its
nonhomologous insertion with herpes drugs such as gancyclovir (GANC) or FIAU
(1-(2-deoxy 2-fluoro-β-D-arabinofuranosyl)-5-iodouracil). By this counter-selection,
the percentage of homologous recombinants in the surviving transformants can be
increased.

The following non-limiting examples are presented to better illustrate
the invention.

EXAMPLE 1 

Identification of the human CG1CE gene and cDNA cloning

Construction of Libraries for Shotgun Sequencing 

Bacterial strains containing the BMD PACs (P1 Artificial
Chromosomes) were received from Research Genetics (Huntsville, AL). The
minimum tiling path between markers D11S4076 and UGB that represents the
minimum genetic region containing the BMD gene includes the following nine PAC
clones: 363M5 (140 kb), 519013 (120 kb), 527E4 (150 kb), 688P12 (140 kb),
741N15 (170 kb), 756B9 (120 kb), 759J12 (140 kb), 1079D9 (170 kb), and 363P2
(160 kb). Cells were streaked on Luria-Bertani (LB) agar plates supplemented with
the appropriate antibiotic. A single colony was picked up and subjected to
colony-PCR analysis with corresponding STS primers described in Cooper et al., 1997,
Genomics 41:185-192 to confirm the authenticity of PAC clones. A single positive
colony was used to prepare a 5-ml starter culture and then 1-L overnight culture in LB
medium. The cells were pelleted by centrifugation and PAC DNA was purified by
equilibrium centrifugation in cesium chloride-ethidium bromide gradient (Sambrook,
Fritsch, and Maniatis, 1989, Molecular Cloning: A Laboratory Manual, second
edition, Cold Spring Harbor Laboratory Press). Purified PAC DNA was brought to
50 mM Tris pH 8.0, 15 mM MgCl2, and 25% glycerol in a volume of 2 ml and placed
in an AERO-MIST nebulizer (CIS-US, Bedford, MA). The nebulizer was attached to a
nitrogen gas source and the DNA was randomly sheared at 10 psi for 30 sec. The
sheared DNA was ethanol precipitated and resuspended in TE (10 mM Tris, 1 mM
EDTA). The ends were made blunt by treatment with Mung Bean Nuclease
(Promega, Madison, WI) at 30°C for 30 min, followed by phenol/chloroform
extraction, and treatment with T4 DNA polymerase (GIBCO/BRL, Gaithersburg,
MD) in multicore buffer (Promega, Madison, WI) in the presence of 40 µM dNTPs at
16°C. To facilitate subcloning of the DNA fragments, BstX I adapters (Invitrogen,
Carlsbad, CA) were ligated to the fragments at 14°C overnight with T4 DNA ligase
(Promega, Madison, WI). Adapters and DNA fragments less than 500 bp were
removed by column chromatography using a cDNA sizing column (GIBCO/BRL,
Gaithersburg, MD) according to the instructions provided by the manufacturer.
Fractions containing DNA greater than 1 kb were pooled and concentrated by ethanol
precipitation. The DNA fragments containing BstX I adapters were ligated into the
BstX I sites of pSHOT II which was constructed by subcloning the BstX I sites from
pcDNA II (Invitrogen, Carlsbad, CA) into the BssH II sites of pBlueScript
(Stratagene, La Jolla, CA). pSHOT II was prepared by digestion with BstX I
restriction endonuclease and purified by agarose gel electrophoresis. The gel purified
vector DNA was extracted from the agarose by following the Prep-A-Gene (BioRad,
Richmond, CA) protocol. To reduce ligation of the vector to itself, the digested
vector was treated with calf intestinal phosphatase (GIBCO/BRL, Gaithersburg, MD).
Ligation reactions of the DNA fragments with the cloning vector were transformed
into ultra-competent XL-2 Blue cells (Stratagene, La Jolla, CA), and plated on LB
agar plates supplemented with 100 µg/ml ampicillin. Individual colonies were picked
into a 96 well plate containing 100 µl/well of LB broth supplemented with ampicillin
and grown overnight at 37°C. Approximately 25 µl of 80% sterile glycerol was added
to each well and the cultures stored at -80°C.

Preparation of plasmid DNA 

Glycerol stocks were used to inoculate 5 ml of LB broth supplemented
with 100 µg/ml ampicillin either manually or by using a Tecan Genesis RSP 150
robot (Tecan AG, Hombrechtikon, Switzerland) programmed to inoculate 96 tubes
containing 5 ml broth from the 96 wells. The cultures were grown overnight at 37°C
with shaking to provide aeration. Bacterial cells were pelleted by centrifugation, the
supernatant decanted, and the cell pellet stored at -20°C. Plasmid DNA was prepared
with a QIAGEN Bio Robot 9600 (QIAGEN, Chatsworth, CA) according to the
Qiawell Ultra protocol. To test the frequency and size of inserts, plasmid DNA was
digested with the restriction endonuclease Pvu II. The size of the restriction
endonuclease products was examined by agarose gel electrophoresis with the average
insert size being 1 to 2 kb.

DNA Sequence Analysis of Shotgun clones 
DNA sequence analysis was performed using the ABI PRISM™ dye
terminator cycle sequencing ready reaction kit with AmpliTaq DNA polymerase, FS
(Perkin Elmer, Norwalk, CT). DNA sequence analysis was performed with M13
forward and reverse primers. Following amplification in a Perkin-Elmer 9600, the
extension products were purified and analyzed on an ABI PRISM 377 automated
sequencer (Perkin Elmer, Norwalk, CT). Approximately 4 sequencing reactions were
performed per kb of DNA to be examined (384 sequencing reactions per each of nine
PACs).

Assembly of DNA sequences 

Phred/Phrap was used for DNA sequence assembly. This program
was developed by Dr. Phil Green and licensed from the University of Washington
(Seattle, WA). Phred/Phrap consists of the following programs: Phred for
base-calling, Phrap for sequence assembly, Crossmatch for sequence comparisons, Consed
and Phrapview for visualization of data, and Repeatmasker for screening repetitive
sequences. Vector and E. coli DNA sequences were identified by Crossmatch and
removed from the DNA sequence assembly process. DNA sequence assembly was
performed on a SUN Enterprise 4000 server running a Solaris 2.51 operating system
(Sun Microsystems Inc., Mountain View, CA) using default Phrap parameters. The
sequence assemblies were further analyzed using Consed and Phrapview.


Identification of new microsatellite genetic markers from the Best's macular 
dystrophy region 

Isolation of CA microsatellites from PAC-specific sublibraries.
Southern blotting and hybridization of PAC DNA with a (dC-dA)n(dG-dT)n probe
(Pharmacia Biotech, Uppsala, Sweden) was used to confirm the presence of CA
repeats in nine PAC clones that represent a minimum tiling path. Shotgun
PAC-specific sublibraries were constructed from DNA of all 9 PAC clones using the
protocol described above. The sublibraries were plated on agar plates, and colonies were
transferred to nylon membranes and probed with randomly primed polynucleotide,
(dC-dA)n(dG-dT)n. Hybridization was performed overnight in a solution containing
6X SSC, 20 mM sodium phosphate buffer (pH 7.0), 1% bovine serum albumin, and
0.2% sodium dodecyl sulfate at 65°C. Filters were washed four times for 15 min each
in 2X SSC and 0.2% SDS at 65°C. CA-positive subclones were identified for all but
one PAC clone (527E4). DNA from these subclones was isolated and sequenced as
described above for the shotgun library clones.

Identification of simple repeat sequences in assembled DNA
sequences. DNA sequence at the final stage of assembly was checked for the
presence of microsatellite repeats using a Consed visualization tool of the
Phred/Phrap package.
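
A scan for CA repeats in assembled sequence can also be sketched programmatically. The Python example below is not the Consed procedure used in this work; it is a minimal regular-expression search offered for illustration, and the function name and the minimum repeat count of 8 units are hypothetical choices.

```python
import re


def find_ca_repeats(sequence: str, min_units: int = 8) -> list[tuple[int, int, int]]:
    """Return (1-based start, 1-based end, repeat units) for each (CA)n run.

    Only the CA strand is searched; the complementary TG/GT repeat on the
    opposite strand would need a second pattern.
    """
    pattern = re.compile(r"(?:CA){%d,}" % min_units)
    return [
        (m.start() + 1, m.end(), (m.end() - m.start()) // 2)
        for m in pattern.finditer(sequence.upper())
    ]


# Hypothetical contig fragment carrying a run of ten CA units.
contig = "GGT" + "CA" * 10 + "TTG"
hits = find_ca_repeats(contig)
```

Each hit gives the coordinates needed to pick flanking sequence for primer design.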

Polymorphism analysis and recombination mapping 

Sequence fragments containing CA repeats were analyzed using the
PRIMER program; oligonucleotide pairs flanking each of the CA repeats were
synthesized. The forward primer was kinase-labeled with [gamma-32P]ATP.
Amplification of the genomic DNA was performed in a total volume of 10 µl
containing 5 ng/µl of genomic DNA; 10 mM Tris-HCl pH 8.3; 1.5 mM MgCl2; 50
mM KCl; 0.01% gelatin; 200 µM dNTPs; 0.2 pmol/µl of both primers; 0.025 unit/µl
of Taq polymerase. The PCR program consisted of 94°C for 3 min followed by 30
cycles of 94°C for 1 min, 55°C for 2 min, 72°C for 2 min and a final elongation step
at 72°C for 10 min. Following amplification, samples were mixed with 2 vol of a
formamide dye solution and run on a 6% polyacrylamide sequencing gel. Two newly
identified markers detected two recombination events in disease chromosomes of
individuals from family SI. This limited the minimum genetic region to the interval
covered by 6 PAC clones: 519013, 759J12, 756B9, 363M5, 363P2, and 741N15.
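
The thermal cycling program given above can be written down as data, which makes the total programmed hold time easy to check. The Python sketch below uses only the times and cycle count stated in the text; the variable and function names are hypothetical, and instrument ramp times between temperatures are not included.

```python
# Program as described: initial denaturation, 30 cycles, final elongation.
INITIAL_DENATURATION_MIN = 3            # 94°C
CYCLES = 30
CYCLE_STEPS_MIN = {
    "denature_94C": 1,
    "anneal_55C": 2,
    "extend_72C": 2,
}
FINAL_ELONGATION_MIN = 10               # 72°C


def total_hold_time_min() -> int:
    """Sum of programmed hold times in minutes (ramp times excluded)."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_ELONGATION_MIN


total = total_hold_time_min()  # 3 + 30 * 5 + 10 minutes
```

The programmed holds come to 163 minutes, so a run takes somewhat under three hours before ramping is counted.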

Identification of the retina-specific EST hit in the pCA759J12-2 clone.

A CA-positive subclone (pCA759J12-2) was identified in the shotgun
library generated from the PAC 759J12 DNA by hybridization to the
(dC-dA)n(dG-dT)n probe. DNA sequence from pCA759J12-2 was queried against the EST
sequences in the GenBank database using the BLAST algorithm (S.F. Altschul, et al.,
1990, J. Mol. Biol. 215:403-410). The BLAST analysis identified a high degree of
similarity between the DNA sequence obtained from the clone pCA759J12-2 and a
retina-specific human EST with GenBank accession number AA318352. BLASTX
analysis of EST AA318352 revealed a strong homology of the corresponding protein
to a group of C. elegans proteins with unknown function (RFP family). The RFP
family is known only from C. elegans genome and EST sequences (e.g., C. elegans
C29F4.2 and B0564.3) and is named for the amino acid sequence RFP that is
invariant among 15 of the 16 family members; members share a conserved 300-400
amino acid sequence including 25 highly conserved aromatic residues.

A human gene partially represented in pCA759J12-2 and EST
AA318352 was dubbed CG1CE (Candidate Gene #1 with the homology to the C.
elegans group of genes) and selected for detailed analysis.



Bioinformatic Analysis of Assembled DNA Sequences

When the assembled DNA sequences from the nine BMD PACs
approached 0.5-1-fold coverage, the DNA contigs were randomly concatenated, and
prediction abilities of the program package AceDB were utilized to aid in gene
identification.

In addition to the DNA sequence generated from the nine PACs
mentioned above, GenBank database entries for PACs 466A11 and 363P2 (GenBank
accession numbers AC003025 and AC003023, respectively) were analyzed with the
use of the same AceDB package. PAC clones 466A11 and 363P2 represent parts of
the PAC contig across the BMD region (Cooper et al., 1997, Genomics 41:185-192);
both clones map to the minimum genetic region containing the BMD gene that was
determined by recombination breakpoint analysis in a 12-generation Swedish pedigree
(Graff et al., 1997, Hum. Genet. 101:263-279). Database entries for PACs 466A11
and 363P2 represent unordered DNA pieces generated in Phase 1 High Throughput
Genome Sequence Project (HTGS phase 1) by Genome Science and Technology
Center, University of Texas Southwestern Medical Center at Dallas.

cDNA sequence and exon/intron organization of the CG1CE gene 

Genomic DNA sequences from PACs 466A11 and 759J12 were
compared with the CG1CE cDNA sequence from EST AA318352 using the program
Crossmatch which allowed for a rapid and sensitive detection of the location of
exons. The identification of intron/exon boundaries was then accomplished by
manually comparing visualized genomic and cDNA sequences by using the AceDB
package. This analysis allowed the identification of exons 8, 9, and 10 that are
represented in EST AA318352. To increase the accuracy of the analysis, the DNA
sequence of EST AA318352 was verified by comparison with genomic sequence
obtained from pCA759J12-2, PAC 466A11, and shotgun PAC 759J12 subclones.
The verified EST AA318352 sequence was reanalyzed by BLAST; two new ESTs
(accession numbers AA307119 and AA205892) were found to partially overlap with
EST AA318352. They were assembled into a contig using the program Sequencher
(Perkin Elmer, Norwalk, CT), and a consensus sequence derived from three ESTs
(AA318352, AA307119, and AA205892) was re-analyzed by BLAST. BLAST
analysis identified a fourth EST belonging to this cluster (accession number
AA317489); EST AA317489 was included in the consensus cDNA sequence. The
consensus sequence derived from the four ESTs (AA318352, AA307119, AA205892,
and AA317489) was compared with genomic sequences obtained from pCA759J12-2,
PAC 466A11, and shotgun PAC 759J12 subclones using the programs Crossmatch
and AceDB. This analysis verified the sequence and corrected sequencing errors that
were found in AA318352, AA307119, AA205892, and AA317489. Comparison of
cDNA and genomic sequences revealed a total of 7 exons. The order of the exons
from 5' end to 3' end was 5'-ex4-ex5-ex6-ex8-ex9-ex10-ex11-3'. BLASTX analysis
of the genomic segment located between exons 6 and 8 in PAC 466A11 revealed
strong homology of the corresponding protein to a group of C. elegans proteins (RFP
family). Since there were no EST hits in the GenBank EST database that covers this
stretch of genomic sequence, this part of the CG1CE gene was called exH
(Hypothetical ex7). This finding changed the order of exons in the CG1CE gene to
5'-ex4-ex5-ex6-ex7-ex8-ex9-ex10-ex11-3'. The BLAST analysis of the DNA region
located upstream of exon 4 identified an additional human EST (AA326727) with
a high degree of similarity to genomic sequence. Comparison of DNA and genomic
sequences revealed the presence of two additional exons (ex1 and ex2) in the CG1CE
gene. This finding changed the order of the exons in the CG1CE gene to
5'-ex1-ex2-ex4-ex5-ex6-ex7-ex8-ex9-ex10-ex11-3'. Bioinformatic analysis did not allow the
prediction of boundaries between exons 2 and 4, exons 6 and 7, and exons 7 and 8. In
addition, there was no overlap between ESTs represented in exons 1 and 2 from one
side and exons 4, 5, 6, 7, 8, 9, 10, and 11 from another. There was the possibility of
the presence of additional exons in the CG1CE gene that were not represented in the
GenBank EST database.

Identification of an additional exon and determination of the exact exon/intron
boundaries within the CG1CE gene.

To identify additional exon(s) within the CG1CE gene and verify the
exonic composition of this gene, forward and reverse PCR primers from all known
exons of the CG1CE gene were synthesized and used to PCR amplify CG1CE cDNA
fragments from human retina "Marathon-ready" cDNA (Clontech, Palo Alto, CA). In
these RT-PCR experiments, a forward primer from ex1 (LF:
CTAGTCGCCAGACCTTCTGTG) (SEQ.ID.NO.:9) was paired with a reverse
primer from ex4 (GR: CTTGTAGACTGCGGTGCTGA) (SEQ.ID.NO.:10), a forward
primer from ex4 (GF: GAAAGCAAGGACGAGCAAAG) (SEQ.ID.NO.:11) was
paired with a reverse primer from ex6 (ER: AATCCAGTCGTAGGCATACAGG)
(SEQ.ID.NO.:12), a forward primer from ex6 (EF: ACCTTGCGTACTCAGTGTGGA)
(SEQ.ID.NO.:13) was paired with a reverse primer from ex8 (AR:
TGTCGACAATCCAGTTGGTCT) (SEQ.ID.NO.:14), a forward primer from ex8 (AF:
CCCTTTGGAGAGGATGATGA) (SEQ.ID.NO.:15) was paired with a reverse
primer from ex10 (CR: CTCTGGCATATCCGTCAGGT) (SEQ.ID.NO.:16), and a forward
primer from ex10 (CF: CTTCAAGTCTGCCCCACTGT) (SEQ.ID.NO.:17) was
paired with a reverse primer from ex11 (DR: GCATCCCCATTAGGAAGCAG)
(SEQ.ID.NO.:18).
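
The primer pairing described above can be summarized as a small data table. The following Python sketch is illustrative only and is not part of the original disclosure: the primer names, exon assignments, and sequences are copied from the text, while the helper names (reverse_complement, gc_fraction, exons_spanned) are hypothetical conveniences introduced here.

```python
# Primer pairs as listed in the text:
# (forward name, forward exon, forward sequence,
#  reverse name, reverse exon, reverse sequence)
PRIMER_PAIRS = [
    ("LF", 1, "CTAGTCGCCAGACCTTCTGTG", "GR", 4, "CTTGTAGACTGCGGTGCTGA"),
    ("GF", 4, "GAAAGCAAGGACGAGCAAAG", "ER", 6, "AATCCAGTCGTAGGCATACAGG"),
    ("EF", 6, "ACCTTGCGTACTCAGTGTGGA", "AR", 8, "TGTCGACAATCCAGTTGGTCT"),
    ("AF", 8, "CCCTTTGGAGAGGATGATGA", "CR", 10, "CTCTGGCATATCCGTCAGGT"),
    ("CF", 10, "CTTCAAGTCTGCCCCACTGT", "DR", 11, "GCATCCCCATTAGGAAGCAG"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def exons_spanned(pairs):
    """Return the set of exon numbers lying within any amplicon.

    Each amplicon is assumed to run from the forward-primer exon
    through the reverse-primer exon, inclusive.
    """
    covered = set()
    for _, fwd_exon, _, _, rev_exon, _ in pairs:
        covered.update(range(fwd_exon, rev_exon + 1))
    return covered


if __name__ == "__main__":
    for fwd, fe, fseq, rev, re_, rseq in PRIMER_PAIRS:
        print(f"{fwd}/{rev}: ex{fe}-ex{re_}, "
              f"GC {gc_fraction(fseq):.2f}/{gc_fraction(rseq):.2f}")
    print(sorted(exons_spanned(PRIMER_PAIRS)))
```

Because each reverse primer sits in the exon that supplies the next forward primer, the five amplicons overlap end to end, which is what allows the RT-PCR products to be assembled into one continuous cDNA sequence.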

A 50 µl PCR reaction was performed using the Taq Gold DNA
polymerase (Perkin Elmer, Norwalk, CT) in the reaction buffer supplied by the
manufacturer with the addition of dNTPs, primers, and approximately 0.5 ng of
human retina cDNA. PCR products were electrophoresed on a 2% agarose gel and
DNA bands were excised, purified and subjected to sequence analysis with the same
primers that were used for PCR amplification. The assembly of the DNA sequence
results of these PCR products revealed that:

(i) exons 1 and 2 on one side and exons 4, 5, 6, 7, 8, 9, 10, and
11 on the other indeed represent fragments of the same gene;

(ii) an additional exon is present between exons 2 and 4 (named
ex3);

(iii) exon 7 (Hypothetical) predicted by the BLASTX analysis is
present in the CG1CE cDNA fragment amplified by EF/AR primers.
Comparison of the DNA sequences obtained from RT-PCR fragments
with genomic sequences obtained from pCA759J12-2, PAC 466A11, and shotgun
PAC 759J12 subclones was performed using the programs Crossmatch and AceDB.
This analysis confirmed the presence of the exons originally found in five ESTs
(AA318352, AA307119, AA205892, AA317489, and AA326727) and identified an
additional exon (exon 3) in the CG1CE gene. The exact sequences of the exon/intron
boundaries within the CG1CE gene were determined for all of the exons. The splice
signals in all introns conform to published consensus sequences. The CG1CE gene
appears to span at least 16 kb of genomic sequence. It contains a total of 11 exons.

Two splice donor sites for intron 7

Two splicing variants of exon 7 were detected upon sequence analysis
of RT-PCR products amplified from human retina cDNA with the primer pair EF/AR.
The two variants utilize alternative splice donor sites separated from each other by 203
bp. Both splicing sites conform to the published consensus sequence.

Identification of 5' and 3' ends of CG1CE cDNA 

RACE is an established protocol for the analysis of cDNA ends. This
procedure was performed using the Marathon RACE template from human retina,
purchased from Clontech (Palo Alto, CA). cDNA primers KR
(CTAAGCGGGCATTAGCCACT) (SEQ.ID.NO.:19) and
LR (TGGGGTTCCAGGTGGGTCCGAT) (SEQ.ID.NO.:20) in combination with a
cDNA adaptor primer AP1 (CCATCCTAATACGACTCACTATAGGGC)
(SEQ.ID.NO.:21) were used in 5' RACE. cDNA primer DF
(GGATGAAGCACATTCCTAACCTGCTTC) (SEQ.ID.NO.:22) in combination
with a cDNA adaptor primer AP1 (CCATCCTAATACGACTCACTATAGGGC)
(SEQ.ID.NO.:21) was used in 3' RACE. Products obtained from these PCR
amplifications were analyzed on 2% agarose gels. Excised fragments from the gels
were purified using Qiagen QIAquick spin columns and sequenced using ABI dye-
terminator sequencing kits. The products were analyzed on ABI 377 sequencers
according to standard protocols.
EXAMPLE 2

Best's macular dystrophy is associated with mutations in an evolutionarily conserved
region of CG1CE

Genomic DNA from BMD patients from two Swedish pedigrees
having Best's macular dystrophy (families S1 and SL76) was amplified by PCR using
the following primer pair:

exG_left AAAGCTGGAGGAGCCGAG (SEQ.ID.NO.:23)
exG_right CTCCACCCATCTTCCGTTC (SEQ.ID.NO.:24)

This primer pair amplifies a genomic fragment that is 412 bp long and contains exon 4
and adjacent intronic regions.

The patients were:

Family S1:

S1-3, a normal individual, i.e., not having BMD; sister of S1-4
S1-4, an individual heterozygous for BMD; and
S1-5, an individual homozygous for BMD.

Patients S1-4 and S1-5 had the clinical symptoms of BMD, including morphological
changes observable upon ophthalmologic examination.

Family SL76:

SL76-3, an individual heterozygous for BMD; mother of SL76-2
SL76-2, an individual heterozygous for BMD; son of SL76-3.

PCR products produced using the primer sets mentioned above were
amplified in 50 µl reactions consisting of Perkin-Elmer 10 x PCR Buffer, 200 mM
dNTP's, 0.5 µl of Taq Gold (Perkin-Elmer Corp., Foster City, CA), 50 ng of patient
DNA and 0.2 of forward and reverse primers. Cycling conditions were as
follows:

1. 94°C 10 min

2. 94°C 30 sec

3. 72°C 2 min (decrease this temperature by 1.1°C per cycle)

4. 72°C 2 min

5. Go to step 2 15 more times

6. 94°C 30 sec

7. 55°C 2 min

8. 72°C 2 min

9. Go to step 6 24 more times

10. 72°C 7 min

11. 4°C
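
The cycling program above is a touchdown protocol: the annealing temperature of step 3 drops by a fixed amount on each pass before the program switches to a constant 55°C. As an illustrative Python sketch (not part of the original disclosure), the per-cycle annealing temperatures can be listed as follows. The function name touchdown_schedule is hypothetical, and the sketch assumes that the first pass anneals at 72°C and that the 1.1°C decrement is applied after each pass, so that "15 more times" yields 16 touchdown cycles in total.

```python
def touchdown_schedule(start_temp=72.0, step=1.1, touchdown_cycles=16,
                       fixed_temp=55.0, fixed_cycles=25):
    """Return the annealing temperature (deg C) of every cycle, in order.

    touchdown_cycles: one initial pass plus "15 more times" (steps 2-5).
    fixed_cycles: one initial pass plus "24 more times" (steps 6-9).
    """
    # Assumption: the decrement is applied after each completed pass.
    temps = [round(start_temp - step * i, 1) for i in range(touchdown_cycles)]
    temps += [fixed_temp] * fixed_cycles
    return temps


if __name__ == "__main__":
    schedule = touchdown_schedule()
    print("total cycles:", len(schedule))
    print("first touchdown cycle:", schedule[0])
    print("last touchdown cycle:", schedule[15])
    print("constant phase:", schedule[16])
```

Under these assumptions the last touchdown cycle anneals at 55.5°C, just above the 55°C used for the remaining 25 cycles. For the RT-PCR program of Example 3, which repeats step 6 only 19 more times, the same function can be called with fixed_cycles=20.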

Products obtained from this PCR amplification were analyzed on 2%
agarose gels and excised fragments from the gels were purified using Qiagen
QIAquick spin columns and sequenced using ABI dye-terminator sequencing kits.
The products were analyzed on ABI 377 sequencers according to standard protocols.

The results are shown in Figure 6. Figure 6 shows chromatograms
from sequencing runs on the PCR fragments from patients S1-3, S1-4, and S1-5. The
six readings represent sequencing of both strands of the PCR fragments from the
patients. As can be seen from Figure 6, the two patients affected with BMD, patients
S1-4 and S1-5, both carry a mutation at position 383 of SEQ.ID.NO.:2. Both copies
of the CG1CE gene are mutated in homozygous affected S1-5, while heterozygous
affected S1-4 contains both normal and mutated copies of the CG1CE gene. This
mutation changes the codon that encodes the amino acid at position 93 of
SEQ.ID.NO.:3 from TGG (encoding tryptophan) to TGC (encoding cysteine). Patient
S1-3, a normal individual, has the wild-type sequence, TGG, at this codon. This
disease mutation that changes this TGG codon to a TGC codon was not found upon
sequencing of 50 normal unrelated individuals (100 chromosomes) of North
American descent.

Both patients from family SL76 carry a mutation at position 357 of
SEQ.ID.NO.:2. This mutation changes the codon that encodes the amino acid at
position 85 of SEQ.ID.NO.:3 from TAC (encoding tyrosine) to CAC (encoding
histidine). This disease mutation that changes this TAC codon to a CAC codon was
not found upon sequencing of 50 normal unrelated individuals (100 chromosomes) of
North American descent.
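
The relationship between the nucleotide positions and the codon numbers given above can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original disclosure. It assumes that the open reading frame begins at position 105 of SEQ.ID.NO.:2, as suggested by the coding range recited in the claims (positions 105-1,859), and that this position is the first base of the initiation codon. The names codon_of and describe_mutation are hypothetical, and the codon table lists only the four codons discussed in this Example.

```python
ORF_START = 105  # assumed first base of the start codon in SEQ.ID.NO.:2

# Only the codons discussed in the text; not a full genetic code table.
CODON_TABLE = {"TGG": "Trp", "TGC": "Cys", "TAC": "Tyr", "CAC": "His"}


def codon_of(position, orf_start=ORF_START):
    """Map a 1-based nucleotide position to (codon number, base within codon)."""
    offset = position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the assumed ORF start")
    return offset // 3 + 1, offset % 3 + 1


def describe_mutation(position, ref_codon, alt_codon):
    """Return a label such as 'Trp93Cys' after checking internal consistency."""
    codon_number, base_in_codon = codon_of(position)
    changed = [i + 1 for i, (a, b) in enumerate(zip(ref_codon, alt_codon))
               if a != b]
    # The single changed base must be the one addressed by `position`.
    if changed != [base_in_codon]:
        raise ValueError("codon change does not match the stated position")
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


if __name__ == "__main__":
    print(describe_mutation(383, "TGG", "TGC"))  # family S1
    print(describe_mutation(357, "TAC", "CAC"))  # family SL76
```

Under the stated assumption, position 383 is the third base of codon 93 and position 357 is the first base of codon 85, which agrees with the TGG to TGC and TAC to CAC changes described above.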

Amino acid positions 85 and 93 of the CG1CE protein are
evolutionarily conserved. Figure 7 demonstrates that position 93 is occupied by
tryptophan not only in the CG1CE protein, but also in 15 of 16 related C. elegans
proteins. The lone C. elegans protein in which this residue is not tryptophan contains
an isofunctional phenylalanine instead. Phenylalanine and tryptophan, both being
hydrophobic, aromatic amino acids, are highly similar. Position 85 is occupied by
tyrosine or isofunctional phenylalanine in all 16 related C. elegans proteins.
Phenylalanine and tyrosine, both being aromatic amino acids, are highly similar.



EXAMPLE 3

Expression of CG1CE

RT-PCR: RT-PCR experiments were performed on "quick-clone"
human cDNA samples available from Clontech, Palo Alto, CA. cDNA samples from
heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas, and retina were
amplified with primers AF (CCCTTTGGAGAGGATGATGA) (SEQ.ID.NO.:15) and
CR (CTCTGGCATATCCGTCAGGT) (SEQ.ID.NO.:16) in the following PCR
conditions:

1. 94°C 10 min

2. 94°C 30 sec

3. 72°C 2 min (decrease this temperature by 1.1°C per cycle)

4. 72°C 2 min

5. Go to step 2 15 more times

6. 94°C 30 sec

7. 55°C 2 min

8. 72°C 2 min

9. Go to step 6 19 more times

10. 72°C 7 min

11. 4°C

The CG1CE gene was found to be predominantly expressed in human retina and brain.

Northern blot analysis: Northern blots containing poly(A+)-RNA
from different human tissues were purchased from Clontech, Palo Alto, CA. Blot #1
contained human heart, brain, placenta, lung, liver, skeletal muscle, kidney, and
pancreas poly(A+)-RNA. Blot #2 contained stomach, thyroid, spinal cord, lymph
node, trachea, adrenal gland, and bone marrow poly(A+)-RNA.

Primers CF (CTTCAAGTCTGCCCCACTGT) (SEQ.ID.NO.:17) and
exC_right (TAGGCTCAGAGCAAGGGAAG) (SEQ.ID.NO.:25) were used to
amplify a PCR product from total genomic DNA. This product was purified on an
agarose gel and used as a probe in Northern blot hybridization. The probe was
labeled by random priming with the Amersham Rediprime kit (Arlington Heights, IL)
in the presence of 50-100 µCi of 3000 Ci/mmole [alpha-32P]dCTP (Dupont/NEN,
Boston, MA). Unincorporated nucleotides were removed with a ProbeQuant G-50
spin column (Pharmacia/Biotech, Piscataway, NJ). The radiolabeled probe at a
concentration of greater than 1 x 10^6 cpm/ml in rapid hybridization buffer (Clontech,
Palo Alto, CA) was incubated overnight at 65°C. The blots were washed by two 15
min incubations in 2X SSC, 0.1% SDS (prepared from 20X SSC and 20% SDS stock
solutions, Fisher, Pittsburgh, PA) at room temperature, followed by two 15 min
incubations in 1X SSC, 0.1% SDS at room temperature, and two 30 min incubations
in 0.1X SSC, 0.1% SDS at 60°C. Autoradiography of the blots was done to visualize
the bands that specifically hybridized to the radiolabeled probe.
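
The three wash buffers above are all dilutions of the same 20X SSC and 20% SDS stocks, so the volumes follow from the usual C1V1 = C2V2 relation. The Python sketch below is illustrative only and is not part of the original disclosure. The function name wash_buffer_volumes and the 1000 ml batch size are hypothetical, and the remainder of each batch is assumed to be water.

```python
def wash_buffer_volumes(final_volume_ml, ssc_x, sds_percent,
                        ssc_stock_x=20.0, sds_stock_percent=20.0):
    """Return (ml of SSC stock, ml of SDS stock, ml of water) for one batch.

    Uses C1*V1 = C2*V2 for each component; water makes up the remainder.
    """
    ssc_ml = final_volume_ml * ssc_x / ssc_stock_x
    sds_ml = final_volume_ml * sds_percent / sds_stock_percent
    water_ml = final_volume_ml - ssc_ml - sds_ml
    if water_ml < 0:
        raise ValueError("requested concentrations exceed what the stocks allow")
    return ssc_ml, sds_ml, water_ml


if __name__ == "__main__":
    # The three washes named in the text, for a hypothetical 1000 ml batch each.
    for ssc_x in (2.0, 1.0, 0.1):
        ssc_ml, sds_ml, water_ml = wash_buffer_volumes(1000.0, ssc_x, 0.1)
        print(f"{ssc_x}X SSC, 0.1% SDS: {ssc_ml:.1f} ml 20X SSC, "
              f"{sds_ml:.1f} ml 20% SDS, {water_ml:.1f} ml water")
```

The decreasing SSC concentration and the higher temperature of the final wash make each step more stringent than the one before.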

The probe hybridized to an mRNA transcript that is uniquely 
expressed in brain and spinal cord. 

A mouse probe for the murine ortholog of the CG1CE gene was
generated based on the sequence of an EST with GenBank accession number
AA497726. The 246 bp probe was amplified from mouse heart cDNA (Clontech,
Palo Alto, CA) using the primers mouseCG1CE_L
(ACACAACACATTCTGGGTGC) (SEQ.ID.NO.:26) and mouseCG1CE_R
(TTCAGAAACTGCTTCCCGAT) (SEQ.ID.NO.:27). Due to an extremely low
expression level of the CG1CE gene in mouse heart, repetitive amplification steps
were used to generate this probe. The authenticity of this probe was verified by
sequence analysis of the gel purified DNA band. A Northern blot containing poly(A+)-
RNA from several rat tissues (heart, brain, spleen, lung, liver, skeletal muscle, kidney,
testis) was purchased from Clontech, Palo Alto, CA. The probe hybridized to an
mRNA transcript that is expressed in testis only.

The present invention is not to be limited in scope by the specific
embodiments described herein. Indeed, various modifications of the invention in
addition to those described herein will become apparent to those skilled in the art
from the foregoing description. Such modifications are intended to fall within the
scope of the appended claims.

Various publications are cited herein, the disclosures of which are
incorporated by reference in their entireties.
WHAT IS CLAIMED:

1. An isolated DNA comprising nucleotides encoding a
polypeptide having an amino acid sequence selected from the group consisting of
SEQ.ID.NO.:3, SEQ.ID.NO.:5, and SEQ.ID.NO.:29.

2. The DNA of claim 1 comprising a nucleotide sequence
selected from the group consisting of: SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:4,
SEQ.ID.NO.:28, positions 105-1,859 of SEQ.ID.NO.:2, positions 105-1,409 of
SEQ.ID.NO.:4, and positions 11-1,663 of SEQ.ID.NO.:28.

3. An isolated DNA comprising a sequence that is identical to
SEQ.ID.NO.:2 except that it contains a different nucleotide at a position selected
from the group consisting of positions 120, 121, 122, 357, 358, 359, 381, 382, 383,
783, 784, 785, 999, 1000, and 1001.

4. An isolated DNA that hybridizes under stringent conditions to
a nucleotide sequence selected from the group consisting of: SEQ.ID.NO.:1,
SEQ.ID.NO.:2, SEQ.ID.NO.:4, and SEQ.ID.NO.:28.

5. An expression vector comprising the DNA of claim 1.

6. A recombinant host cell comprising the DNA of claim 1.

7. A CG1CE protein, substantially free from other proteins,
having an amino acid sequence selected from the group consisting of SEQ.ID.NO.:3,
SEQ.ID.NO.:5, and SEQ.ID.NO.:29.

8. The CG1CE protein of claim 8 containing a single amino acid
substitution.

9. The CG1CE protein of claim 9 where the substitution occurs at
position 6, 85, 93, 227, or 299.
10. The CG1CE protein of claim 9 where the substitution is a
conservative substitution.

11. The CG1CE protein of claim 8 containing two amino acid
substitutions.

12. The CG1CE protein of claim 8 containing an amino acid
substitution where the substitution does not occur in a position where the amino acid
present in CG1CE is also present in the corresponding position in one of the C.
elegans proteins whose partial amino acid sequence is shown in Figure 7.

13. An antibody that binds specifically to a CG1CE protein where
the CG1CE protein has the amino acid sequence selected from the group consisting of
SEQ.ID.NO.:3 and SEQ.ID.NO.:5.

14. A method of diagnosing whether a patient carries a mutation in
the CG1CE gene that comprises:

(a) providing a DNA sample from the patient;

(b) providing a set of PCR primers based upon SEQ.ID.NO.:2 or
SEQ.ID.NO.:4;

(c) performing PCR on the DNA sample to produce a PCR
fragment from the patient;

(d) determining the nucleotide sequence of the PCR fragment from
the patient;

(e) comparing the nucleotide sequence of the PCR fragment from
the patient with the nucleotide sequence of SEQ.ID.NO.:2 or SEQ.ID.NO.:4;

where a difference between the nucleotide sequence of the PCR
fragment from the patient and the nucleotide sequence of SEQ.ID.NO.:2 or
SEQ.ID.NO.:4 indicates that the patient carries a mutation in the CG1CE gene.

15. The method of claim 15 where the DNA sample is genomic
DNA.
16. The method of claim 15 where the DNA sample is cDNA.

17. A DNA or RNA oligonucleotide probe comprising at least 18
contiguous nucleotides of at least one of a sequence selected from the group
consisting of: SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:4, and SEQ.ID.NO.:28.

18. A method for determining whether a substance is an activator 
or an inhibitor of a CG1CE protein or a mutant CG1CE protein comprising: 

(a) recombinantly expressing CG1CE protein or mutant CG1CE 
protein in a host cell; 

(b) measuring the biological activity of CG1CE protein or mutant 
CG1CE protein in the presence and in the absence of a substance suspected of being 
an activator or an inhibitor of CG1CE protein or mutant CG1CE protein; 

where a change in the biological activity of the CG1CE protein or the 
mutant CG1CE protein in the presence as compared to the absence of the substance 
indicates that the substance is an activator or an inhibitor of CG1CE protein or mutant 
CG1CE protein. 
TITLE OF THE INVENTION

BEST'S MACULAR DYSTROPHY GENE

ABSTRACT OF THE DISCLOSURE

Novel human and mouse DNA sequences that encode the gene
CG1CE, which, when mutated, is responsible for Best's macular dystrophy, are
provided. Provided are genomic CG1CE DNA as well as cDNA that encodes the
CG1CE protein. Also provided is CG1CE protein encoded by the novel DNA
sequences. Methods of expressing CG1CE protein in recombinant systems are
provided. Also provided are diagnostic methods that detect patients having mutant
CG1CE genes.
ccaaaaaatt
attttgggca 
tggtgggttc 
gtaaccaacc 
ctgcacacgt 
aaaatcacct 
agagggtgac 
tattataaaa 
ataatcccct 
ctcagactct 
ggcgattctt 
tgctctcttc 
acttccaaca 
ttattgaatc 
catttattcc 
cagggctgtt 
ggcaggcttc 
cagccaatca 
gctcctcgtc 
ttcccaagca 
gagctgaaac 
agaaaccagg 
acagggctgt 
agagttcctg 
ctctctctgc 
agcctctgga 
gccctggtct 
ccaggctgtg 
acaaggactc 
ctcagcactc 
ggccacagag 
GGGATCATCG 




gttctcttgg gggttggggc gacaagcggg aagggagggc 
aattggctta ttgccacgca agggctttaa caccttaggt 
acaggttgca ggcaacccac catggcacac gtatacctat 
tgcaccatca tgtataccta tgtaaccaac ctggtacatt 
atcccaggac tttagagtga aaaaaaaagt ggtgtgtaga 
gcaatctcag catagttaac gcttagtaca tttcagagag 
aggaaaggga ggatgagagt gggtttaaga .cacaaggtca 
tcagggcttc tggaagttta gtcccaaaac cacacatctc 
gcagtgcttg attaaaatgc aacatcacta aggccacaga 
ggagaaagat ccagaaaact gcccgtttaa taaacatttg 
acggcctcta aagaccaaga accactgctg cctagagctc 
attgaacaat acaagaggag tgtgtaggta gacacccacc 
gcttaggaga gcccttgagt atggattgat gtattaaaat 
acatgctgag attttcacca gctgcccgtg gggatctggg 
catattgcac tggctggctg gaagccagca gcataaactc 
ctgtcaaccc ccaccagact cacccccctc caccagcccc 
tccttccatc tctctgaagc aacttactga tgggccctgc 
cagccagaat aacgtatgat gtcaccagca gccaatcaga 
agcatatgca gaattctgtc attttactag ggtgatgaaa 
acaccatcct tttcagataa gggcactgag gctgagagag 
ctacccgggg tcaccacaca caggtggoaa ggctgggacc 
actgttgact gcagcccggt attcattctt tccatagccc 
caaagacccc agggcctagt cagaggctcc tccttcctgg 
gcacagaagt tgaagctcag cacagccccc taacccccaa 
aaggcctcag gggtcagaac actggtggag cagatcattt 
ttttagggcc atggtagagg gggtgttgcc ctaaattcca 
cagcccaaca ccctccaaga agaaattaga ggggccatgg 
ctagccgttg cttatgagca gattacaaga agggactaag 
ctttgtggag gtcctggctt agggagtcaa gtgacggcgg 
acgtgggcag tgccagcctc taagagtggg caggggcact 
tcc CAGGGAG TCCCACCAfiC CTAGTCGCCA GACCTTCTGT 



ggtggactgg 
cgtatttgtg 
ggcacctgtg 
gcatcaggag 
aagaatggag 
ctaaaaaaaa 
ttgtccttat 
gaggacceaa 
ctttagaggg 
aggagcggcc 
ctctctcaaa 
taaaactcca 
ccttctacga 
ggaaacaggc 
agaaatatca 
tctatttcca 
agtgggaagg 
aagatcagca 
tgagcatctc 
ggacagggct 
agagcttggg 
gaggacctga 
ttgtattatt 



GACCCACCT g gaaccccacc 



gatggggctt tgaggccttc 
tgggatatgc acacacaggc 
tgtctgtgca aatgccctga 
cgacagccag ccagtgtggc 
ggggcatcaa tcactgacaa 
gaaggtctct tctttcgata 
aaatataagg gaggagccgc 
gaccccgtgg gttgtgtgtt 
agcgtgggag aaccgctgta 
gcccaaaaaa tatccctccc 
aagatgaaga ggaagccgga 
ggtagnnnnn nnnnntgctt 
gaacacaaga ggagcttcca 
agatatcctg tataatttca 
agcaaggtga ggagacacag 
ggttggatgg ttgggaacat 
aaccatgcag gtatctcagg 
ggtggaaagg ccctggagcc 
taccagctag gttccattat 
gcctggtccc ttccatactt 
agctaacgaa caagatgggc 
gcttagtgtg tagacattgc 
tatttattta tttattgatc 

FIG.1A 



tgtgagtaca 
agggttggat 
agcacatgcg 
ggtgggaatg 
tgcagcaaaa 
aattatttat 
gaagaaggga 
ccctcaaaaa 
ttccaggggg 
ttcaggcctc 
gggcgataag 
gttgtatgtg 
cagtaaattt 
ttctgaggag 
agtagtgata 
agcaccggtg 
cctttctaaa 
aagagcttcc 
accattcagt 
gggaatggga 
ctcacactag 
tgagaacact 
tgctgttact 
ttaagacaga 



aggtgcccca 
ggccatcttg 
caggtgtgtg 
agcttggtgt 
cacacaggga 
agagctcccc 
gagagggggt 
ataagggagg 
agctcgaacc 
tcgagagaaa 
aaatggtggc 
ttgatatttt 
ttattgagcg 
gaaacaggca 
agtgctctct 
gcagtggggc 
gggaacctgg 
tccaggcagg 
aaacatcatt 
atatggtggt 
ggtggttgag 
gcctagccca 
gcctttgtcg 
gttttgctct 
2751 tcttacccag gcttgagtgc aatggcgtga tctcagctca ctgcaacctc
2801 cacctcctgg gatcaagcga ttctcctgcc tcagcctcct gagtagctgg 
2851 gattacaggc acccgcacca cgcctggata atttttttgt atttttagta 
2901 gagacagggt ttcaccatgt tggacaggct ggtctcgaac tcctgacctt 
2951 aggtgatcca cctgcctcga cttcccaaag tgatgggatt ataggcatga 
3001 gccactgcgc ccagtgatta tagaaagtta aaggcacatg gcaatgcaca 
3051 cgcctatcta cgtcttccct gccaaagcaa agggcagcct ctgggctcac 
3101 tttcttgcgt ttctacttcc aaaaggcagt cagaactggc agggccttgg 
3151 agaccacttc atccacctcc tagggtccct atgggagagt tgaggtccag 
3201 agcagggaag ggtcctgaca ggctctgacc agggcctctg atccctacaa 
3251 acccccaatc ggtgtccctc tctaccagfiA CCCAAGCCCA CCTGCTGCAG 
3301 CCCACTGCCT GGCC4TGACC ATCACTTACA CAAGCCAAGT GGCTAATGfX. 
3351 CGCTTAGGCT CCTTCTCCCG CCTGCTGCTG TGCTGGCGGG GCAGCATCTA 
3401 CAAGCTGCTA TATGGCGAGT TCCTAATCTT CCTGCTCTGC TACTACATCA 
3451 TCCGCTTTAT TTATAG qtaa agctggcagg gctgggccgg ggggcctggg 
3501 aaggatgtgg ctggggctgg gagctgggag ctcctggggg cctcccagcc 
3551 agctcagggc ccagtgcacc agtccactac aacactaagc tgggctcctg 
3601 accagctcct gggcactgga gctgaggctg cgcgctgggg gctgggcaga 
3651 gtaaagaagt cacactgaga ggatgctcaa gccaggccag cagggtttta 
3701 gccacccttc ctccaacccc aggaggaccc ctggagccca ggctttgtct 
3751 ggccccactc tactggcctg ttttactgaa tcccacacag actcataggc 
3801 ccacatagta cattaaaaaa gagagagaga gagagagaga gagagagatg 
3851 gagtctcact gtgttgtcca ggctggtctc gaactcctag gctcaagcaa 
3901 tccccctgcc ttagcctccc aaggggctgg gattacaggt gtgagctact 
3951 gcacttgacc aaccacatgg tacttttttt tttttttttt ttttttgaga 
4001 cagggtttca ctccatcacc caggctggag tgcagtgggg gcaatcttgg 
4051 ctcactgtaa cctctgcctc ccaggtgcaa gcgattctcc tgccttagcc 
4101 tcctgagtag ctggaattat aggcacacac caccacgcct ggctaatttt 
4151 tttttttttc tgtattttta gtagagacag ggtttcatca tgttggacag 
4201 gctggtcttg aacccctgac ctcaagtgat ccacccacct cggcctccca 
4251 aagtgctggg attacaggtg tcagccacca tgcacagccc acatggtaca 
4301 ttttttaaaa ttatttttta attaaantgt ttatctaagg ccagtagcag 
4351 tgactcgcgt ctgtaatccc agcactttga ggggccaagg tgcggggatc 
4401 acttgagcct gggagttcag cgtgggcaac atagtgagac cccgtctcta 
4451 ccaaaaattt aaaaaattag ctgggagtgg tggcatttgc ctgtggtccc 
4501 agctacttgg gaagctgagg tgtggggatg gctgaagcct gtgaggtcga 
4551 ggctgcagtg agctatgatc acaccactgc acttcagcct gagtgacagg 
4601 ctatctcaaa agcaaacaaa ataatgttta tctaaacggt aaggtataat 
4651 cacagaatat atgatagcat tttaaattga aaaagcatta atgattacat 
4701 ggattgtaaa atatcaaata catgaaattc ttgtgttctt aataatgcta 
4751 gcaacaaggc acatttggtt tttactaggg caccaaggta ctttaaaaaa 
4801 agttagggcc agccacaggg gctcacacct gtaatcccag cactttggga 
4851 ggccaaggca ggaggatcac ttgagcccag gagtttagga cctgagcaac 
4901 atagggagat cctgatcttg tctctataaa aaattaaaaa attggctagg 
4951 ccctttggct tacacccgta atcccagcac tttgggaggc cgaggcgggt 
5001 ggatcatgag gtcaggagtt caagaccagc ctggccaaca tagtgaaccc 
5051 aatctctact ataaatacaa aaattagccg agtggggtgg cacgcacctg 
5101 tagttccagc tactcaggag gatgaggccg gagaatcgct tgagcccggg 
5151 aggcagaggc tgcagtgagc cgagaccatg ccattgcact ccagcctagg 
5201 tgacagagtg agactccgtc ttaaaataat attaaaatct taaaatgatc 
5251 tgggcatggt ggcttatgcc tgtagtccca cccagctctt caggaggctg 
5301 aagcgggagg attgcttcag cccaggaggt tgaggctgca gtgagtcatg 
5351 actgtgccgc tgcccttgag cctgggtaac agagcaagac cctatctcaa 
5401 aacaaacaaa caaacaaaca aacaaacaaa aaccaataaa ccaaaaacat 
5451 ttatctaaac aataaaataa aggacagata taatcaccga atatatgata 
gcattttaaa
aaatacataa 
ggtctttggt 
attttttaga 
ttctcatgtg 
aacaccccca 
tgacacatca 
tgggcagtac 
atcacagcat 
tattcacccc 
agagacagtg 
agcccattgc 
cctccagtgg 
tttgtagaga 
ggtcctgcct 
tgcccgtccc 
ggatgcattc 
cttccttcct 
gaaaccactg 
agcatcatgg 
atgtgggata 
cctggggaca 
ccacccccac 
GAGAAACTGA 
ttgaaaaagc actaatgact acaatggatt ataaaacatc
aattcttaag ttcctcctaa taccaaatac aaagcacatt 
ttttacttgg gcaccaatgc atgctgaaaa agagtcgttc 
gtagttttag gttcacagca aaattgagca gaaggtagag 
tctctttgct cctcaccctg cccccagcct ccccactatc 
cactacagtg gtagatttat tacaatccct gaacccacag 
ctatcaccca aagttcatag cgtacagcag ggttcactct 
attccatggg tttggataaa tgtgtaatga tgtctccacc 
caggcagagt agtttcactg ctctaacaaa atcctctgcc 
tctcattaaa gccaaacact ctgtttcctt ttttcctttt 
tctcgctctg tcaaccaggc tgaagtgcaa tggcaatcac 
agcctccaac tcctgggctc aagtgatcct cctatctcag 
ctacgactgc aggcatacgg caacggcacc caactaattt 
tagggtcttg ctatgttgac caggatggtc ttgaactctt 
tagcctccca gagctctggg attacaggcg tgaaccaccg 
aaacactctg tttcgacctg cttttaaaca actgaccctt 
aaaggatcag ggtgtctgaa actggcctct gcagcaggac 
acacatctcc cagtggccag tgtgaggatt ctccccacaa 
gagggggcct cctcctgtcc gggtttgggg ctgtacaagg 
acctggctca ggcctcagga ggggccctgg gctggggaaa 
gcatcgaggc agtcccactc ctacccaggg ccgggctaga 
gtctcagcca tctcctcgct gcgtccacac aattccaccc 
ccccagQCIG GCCCTCACGG AAGAACAACA GCTGATG1 



CTTCGTGCTG 
gcccaggctc 
ggctggggag 
agggaaaggt 
agacgtcctg 
acatcatggt 
tttctgggac 
cgcgcctcca 
ggagggctgg 
gcatcgccgg 
ccctcgcccc 
ACCCGCTGGT 



CTCTGTATTG CGACAGCTAC ATCCAGCTCA TCCCCATTTC 



Ggtgagttcc 
cagacaggcc 
ggggcggggg 
gcggactgca 
ccgttagcaa 
ccctggagcc 
cagcaggggg 
tgagaggctc 
gggctaggcc 
gcgctgggcc 
ccgcccctcc 
GGAACCAGTA 



cccttctggc 
aggggaggat 
aacgccagcg 
gccagagaaa 
tgaaaacccc 
cctgcgcggg 
acccccgggt 
tgcctgcctc 
cgctcgcagc 
ctgggctctg 
tgcccag GCT 
CGAGAACCTG 



tgttccgggt 
cacgaggagc 
gcaggtcggc 
ctgaagttag 
attttctgag 
aggggagggg 
gacagaaccc 
tcgctcccga 
agaaagctgg 
gccgcagact 
TCTACGTGAC 



ccctgtggcc 
tgcggcaagg 
gcctctctgt 
acgttaggta 
ggaagcgctg 
gtctggcgga 
ttggggctct 
gcgccttcca 
aggagccgag 
ggcccctcgc 
GCTGGTCGTG 



CCGTGGCCCG ACCGCCTCAT 



GAGCCTGGTG TCGGGCTTCG TCGAAGGCAA GGACGAGCAA GGCCGGCTGC 
TGCGGCGCAC GCTCATCCGC TACGCCAACC TGGGCAACGT GCTCATCCTG 
CGCAGCGTCA GCACCGCAGT CTACAAGCGC TTCCCCAGCG CCCAGCACCT 



GGTGCAAGCA 
ccaggggccg 
tcacccggac 
atttgggggt 
gaggccagga 
acccttgagg 
tactagggtc 
actctggagg 
acctggtcct 
cctgtaatcc 
cagctgtttg 
aaaacattaa 
ctgagtatcg 
aggctgcagt 
agccagaccc 
cacagaacag 



Ggtgggcgga 
agatgggcgc 
tcgggggact 
ccaattgggc 
gcccaccctc 
gataatggaa 
tacttccctc 
tatgggacat 
ggttaataag 
cagtgcttta 
agacgcacct 
aaattagcag 
ggaggctgag 
gcgctaagat 
tttctctgga 
cacctagtag 



ccgggagcaa 
ggcaggaacg 
gggtggagcc 
gggacagagt 
cgagagtagg 
agaagggtga 
tgcccttgcc 
tggtctctga 
acagacccag 
ggaggcaaag 
gagcaacata 
ggcatggtgg 
gcaggaggat 
cgcaccgctg 
aataaataaa 
gtgctcagaa 



cggggaggca 
gaagatgggt 
aggagtgggg 
cgggtgtctg 
agtctgaggc 
cggcttggga 
cctcttgatc 
caccccctca 
gctaggcgtg 
gtgggaagat 
gcgagacccc 
cgtgtgcctg 
cacttgagcc 
cactccaacc 
taccctgccc 
atttttttgt 



ccgggcagag 
ggagccaaag 
tgtggtcaag 
aaggtggggc 
agggataagg 
actggtgagg 
tccggtttcc 
gcctggcctg 
gtggctctcg 
cgcttgagcc 
catctctaca 
tagtctgagg 
cagcagttcc 
tcggtgacag 
acatgctcag 
tgttgaaaga 
8251 aagaggatgg caaaggagtg ctgaggttcc tataggtcag caggtgccgg
8301 ccatcccttc tgcaggttct cccacccacc gccttcttca ctccactctg 
8351 cagG CTTTAT GACTCCGGCA GAACACAAGC AGTTGGAGAA ACTGAGCCTA 
8401 CCACACAACA TGTTCTGGGT GCXCTGGGTG TGGTTTGCCA ACCTGTCAAT 
8451 GAAGGCGTGG CTTGGAGGTC GAATCCGGGA CCCTATCCTG CTCGAGAGGr 
8501 TQCTQAAC gt gagcccactg tacagacagg gctgccgcag agtgggaagg 
8551 gttgtggtcc acaggaaaca aggtttccta caaagagaag ccttgggccc 
8601 ctgagggtct tccgagagcc ggaggtgggg ttgcagaatc ttttccaaca 
8651 gcaatccaca gaccgaggtg gtcccttatc agaggcccct ccctcttctc 
8701 caagtctgtg aggtcctggt tcccttttga tagatgagga agctgagaca 
8751 caaagaggtt tagtgagctt cccatggcca cacagccagg aatggaccat 
8801 aggtaccagg ccctggtacc tggagaagag gtgggggcga gcccagggtg 
8851 ggggcaggtg gtgttcagaa ccccatcccc ctcttctgcc ccccaa GAGA 
8901 TGAACACCTT GCGTACTCAG TGTGGACACC TGTATGCCTA CGAGTGGATT 
8951 AGTATCCCAC TGGTGTATAC A£A£gtgagg actaggctgg tgaggctgcc 
9001 cttttgggaa actgaggcta gaaggaccaa ggaagcagct ggggtgggaa 
9051 gggctcacct agaggctaag tggctcccct gggagttggg tccacacttt 
9101 gaagttgggt ctggactttg aagtgccaag ttctaagagt ccaggctcct 
9151 gcctggccca gtccagtaga ggcaatgtga ttatccccat attaaagaga 
9201 ggttggcagg gcacagtggc tcatgcatgt aatcccagca ctttgggaag 
9251 ctgaggcagg tggatcacct gaggtcagga gttcgagacc agcctggcca 
9301 acatggtgaa accccatctc tactgaaaat acagaattag ctgtgtggtg 
9351 gtgcacgcct gtaatcacag ctacttggga ggctgaggca ggagaatagc 
9401 ttgaacccgg gaggtggagg ttgcagtgag ctgagatcat gccactgcac 
9451 tccagcctgg gcgacacagc aagactctgt ctcaaacaaa caaacaaaca 
9501 aacaaacaaa caaacaaaca aaggggttaa cagagcccct aagtcacata 
9551 agtgtgcaag tcagaacaag gccttggtct cctgtctcag actcccagcc 
9601 cctggagcat cctgatttca gggttcccac ctagcccttt gctaccacat 
9651 cctcctcctc ctcctcctcc tcccao GTGG TGACTGTGGC GGTGTACAGC 
9701 TTCTTCCTGA CTTGTCTAGT TGGGCGGCAG TTTCTGAACC CAGCCAAGGC 
9751 CTACCCTGGC CATGAGCTGG ACCTCGTTGT GCCCGTCTTC ACGTTCXTGC 
9801 AGTTCTTCTT CTATGTTGGC TGGCTGAAGG TGGGCCTCTC CAGGGCCCTG 
9851 CTGGGCTGGA GGCATGGCCA GAGGGGTCAT GGCCAGCAGC TGCTTGAGAC 
9901 GAGGATGCAG TGTCAGGAAA GGAAGGTCTC ACGGGTAGAA AGCAGCCAGG 
9951 CGTGGTGGCG CACACCTGTA ATCCCAGCTA CTCGGGAGGC TGAGGCAGGA 
10001 GAATCGCTTG AACCCGGGAG GCGGAGGTTG .ffigtgagttg agatcgtgcc 
10051 actgcactcc agcctgggca aaagaatgaa actctatctc aaaaacaaca 
10101 acaacaacaa aacaaagccc taaggttcag aagcccctgc cctttagaag 
10151 cagagcgaac actctcctat taagatgctg ttgggtgtct ttttcactca 
10201 gtagctgtcc agtattctcc acacagcata atagacagat tctaatacaa 
10251 atttcttcaa ctcttaattc ctcctttgtg ccacdatttt ttcttctacc 
10301 tcctaattta tgaatgggtt agtatgctct gcttctgcat tgagacaaaa 
10351 tacagagaga gagaaagatc tatcttaatc ccgccccatt ttagttggaa 
10401 aaaaacttta ttaaatcagg caagtaaaat ccgccaagga ttgnnnnnnn 
10451 nnnagatgtt ctgaatcaga gagttttctc tcgagctctt tatctttcct 
10501 tccttatgtt gcccacccac tctctctcac ttcctacctt cctttatttt 
10551 ttggtaatgg gggtgtaagt ctctgtctct gcccttcctg tcactgtgac 
10601 acacacacac acacacacac acacacacac acacacacac attcatattc 
10651 ctctaaattc cccctgcacc cccagttatc tttggtttct gcagatcaaa 
10701 acaaatcaca cttttatgct tgaaattctc cagggtgccc cagtggcctg 
10751 caagatgtcc cctggacccc taaggcagac gcgtgtcacc tcttcggggc 
10801 tttgttaggg cattttagag gttgctatcc aggaatctgc ccacctagac 
10851 tgccctttag ttcagcccag cttcagtata tatctctgtt gcatgaatga 
ataaaattat
gactcagccg 
actggctcag 
acaagtgtgg 
aggcaggaag 
agggggaagc 
cctgtcccca 
TGATGATTTT 
gcaactccag gtaagataca

agtgatacac tcagggacag 

aagagttaga ggggctgtgt 

ggggctggag ccctaaactc 

ggcttcatgg ggtgtggaaa 

tggctttgag gagttctgcc 

agGT GQCAGA GCA GdCATC 



tgaggtgaga 
ctglgggtgt 
ccagaagtgt 
tgcctttgaa 
tagcagcagc 
tgagggttta 
AACCCCTTTQ 



GAGACCAACT GGATTGTCGA CAGGAATTTG 



taaaggcagt 
tcagggaagg 
gtgggtgcct 
gacagtggtc 
tgaggtttaa 
cagagcctca 
GAQ A GGATG A 



gagagggaga 
gagctcctcc 
atctttgagg 
aactgaggtc 
caaggtcctg 
qaccaq GTGT 
GATGGAGCCG 



gaaaccatac 
ctcctgcagc 
ctgcaggcag 
cagagagagg 
cctgggatga 
CCCTGTTGGC 



catggacctt 
cagtcattca 
gcacccatct 
gagagatCcc 
tctttctgtg 
TGTGGATGAG 



ccccaaagtg 
ctcacaggat 
ccccatttca 
tccaagtcat 
ggacttcttc 
ATGCACCAGG 



CAGgtatggg 
gacccaaaga 
tctcacctca 
caggcaggga 
caggcacata 
tgtccctggt 
ACCTGCCTCG 



GACATGTACT GGAATAAGCC CGAGCCACAG CCCCCCTACA 



CAGCTGCTTC CGCCCAGTTC CGTCGAGCCT CCTTTATGGG CTCCACCTTr 



AACATCAG gt 
tgcaggggtc 
ttgcttcagt 
tatttttttc 
attaaagtac 
caaatggtgc 
ggaacgttag 
tttgagacag 
tcttggctca 
tcagcctccc 
aatttttgta 
gtctccaact 
ggaattatag 
tgggaagtgg 
cagcaggcag 
ccgagtaaag 
gtcccacttc 
aagggctatc 
gccttggaga 
ccacggtatc 
aacggagttt 
cggctcactg 
gcatcctgag 
ttttgtattt 
gaactcctga 
gattacatgt 
aatatcetac 
gaggaatggt 
cagaaacatt 
gcagctgaag 
gcatagacct 
ccaagtctca 
agggacagaa 
ccagctgggt 
ggcaagggca 
aggtctgggg 
gtggatgtca 
cactcatggg 



gtggccagag 
tgcctaggaa 
aagtgtcagg 
ctcccaataa 
aggttcagag 
atttgctact 
gacctggctc 
tatctcgctc 
ctgcaacctc 
cagtagctgg 
cttttagtag 
cctgaccagt 
gtgtcaaaac 
aagtggggtt 
ccaggccatc 
ggctcaggcc 
cctgattcca 
ccagctggtc 
gtgttgggca 
cagtgctgtt 
cactcttgtt 
caacctccgc 
tagctgggat 
ttagtagaga 
cctcaggtga 
gtgagccact 
tagactgcaa 
tgggaaggtc 
tctggaggat 
gttgttgagg 
tgtctccaag 
aactctggat 
catggaacac 
ctggagctga 
ggccatactc 
ctcccgggat 
ctcccagttg 
cctcatctga 



ccagggggct 
cttagaatag 
cactgtacta 
ttctggtttg 
agagtaagtt 
cgaaggacag 
ttgtcatcca 
tgtcgcccag 
cgcctcctgg 
gattacaggt 
agatgaggtt 
aatctgcccg 
tatgttttct 
ccctgggatg 
acaggtacct 
acccacagca 
tctgaatccc 
ctttctcccc 
catgtcaggg 
ctcgcttgtt 
gcccagagct 
ctcccagatt 
tataggtgcc 
cagtttcacc 
tccaccctcc 
gtgcctggct 
tcgagtttaa 
atcaaatgaa 
gactttgagc 
gatggggagg 
gaatgcacaa 
acaaggtaca 
agtcatattt 
gccatggaac 
tctggtagat 
gcctgttgct 
gaaccacaaa 
accactcatg 



gggtgggaag 
cactagttaa 
tgctctttat 
ttatcccaag 
gtccaaggcc 
cctatgatca 
gaactatgtt 
gttggagcgc 
gttcaagtga 
gcccacaacc 
tcaccatgtt 
ctttggcctc 
gataagctac 
ggggaggggc 
cctgaattga 
gccagactta 
tcttgagctg 
aggacaacag 
ttcatactca 
cttttctttt 
ggagtgcagt 
caagcaattc 
agccaccaag 
atgttggcca 
tcagcctccc 
gcttgttctt 
ctacagtcta 
ggctggaggc 
cctacjatggt 
gctgaaaaca 
tttatggagg 
aagtactgga 
gtctgcatgg 
atgggaagaa 
aagctttcct 
aggaagtcaa 
ttcctggcat 
ccagggcacc 



cccctcctag 
tgcatacagg 
aaacattaac 
ttthcagata 
acatagctac 
gtgatgcagt 
ttcttttctt 
agtggcgtga 
ttctcctgct 
acaactggct 
ggccaggctg 
ccaaaatgct 
gatgcttgga 
agcaaagtcc 
ctttgtccta 
tccccacatg 
cagtgggctg 
agttgaaagt 
agggtttctt 
ttttttttta 
ggcataatct 
tcctgcctca 
cccggctaat 
ggctggtctc 
aaagtgctgg 
ttaagaacca 
tagatactgt 
ttgcttaggt 
ctgtacccca 
gaacgataaa 
gagctcaaac 
tgtccagaaa 
gaggcggctt 
tctgaacttg 
tgcagggtaa 
atttctcttt 
tgcccagagt 
agtgtttctg 
Sequence position labels: 13601 to 16101, one per 50-nucleotide row


actgcctgga 
ttacacgcca 
gagggaatca 
aggtacagta 
gacggtgtgg 
ggagaagtaa 
tctcctgttt 
ATCAGGAGGA 



gtgaggggtt 
ggaggggtgg 
acaaacagtg 
cagatcagga 
ccttggcttg 
ggccaggtgt 
ctttccag££ 
CGAGGAGGAT 



ttacagggga 
ttgcgggggt 
aggtgagctg 
gagaggtgag 
ggccaactga 
tggtcctttg 
TGAACAAAGA 



agtgaatgat 
tggatgttaa 
ggcctggagg 
agctggggca 
gagagaggag 
tccactggct 
GGAGATGGAG 



gaggaggcct 
ctctggtcaa 
gatcaccggg 
tggtgaggaa 
cgggggtaag 
cagccctgca 
TTCCAGCCCA 



GCTCACGCTG G CATCATTGG C.C.KC.TTC.C.Tb 



GGCCTGCAGT CCCATGATCA CCATCCTCCT. AGGGCAAACT CAAGGACCAA 
ACTACTGTGG CCCAAGAGGG AATCCXTTCT CCACGAGGGf CTGflff AAAA 
ACCACAAGGC AGCCAAACAG AA CGTTAGGG GCCAGGAAGA GAAf AAGGfT. 
TGGAAGCTTA AGGCTGTGGA CGCCTTCAAG TCTGCCCCAC TGTATGAGAG 
GCCAGGCTAC TACAGTGCCC CACAGACGCG C.C.1CAGCC.CC. ACTGCr.ATGT 
TCTTCCCCCT AGAACCATCA GCGCCGTCAA AGCTTCACAG TGTr.Ar.AGGr 
ATAGACACCA AAGACAAAAG CTTAAAGACT GTGAGTTCTG GGGCCAAGAA 
AAGTTTTGAA TTGCTCTCAG AGAGCGATGG GGGCTTGATG GAGr.Arrr.AG 
AAGTATCTCA AGTGAGGAGG AAAACTGTGG AGTTTAACCT GAr.GGATATG 
CCAGAGATCC CCGAAAATCA GCTCAAAGAA CCTTTGGAAC AATGAGr.AAr. 
CAACATAGAG ACTACACTCA AAGATGAGAT GGATCCTTAT TGGGCCTTGG 



AAAACAG atr. 
cccaccccag 
agggttccat 
ggggtatata 
ccttttctca 
tgaaggaaga 
agggctgaca 
ttactttgag 
ggatgacaga 
actacaggaa 
ggtgaggttg 
gaacctcacc 
acaaaatcag 
aattataaac 
tcccttttct 
ctattatgat 
gctaagacag 
tcatatttaa 
aggaggtcag 
tctaccaaaa 
ccaacgcagg 
tgcagtgaga 
gtctcaaaaa 
caacattttg 
tactttcatt 
GGGGATGCTT 



tgtcctccac 
cttcccttgc 
cactgccaga 
cttggccacc 
cttcaccctg 
tgaggttgtg 
ggccaggctt 
caagggtggc 
tgaacacttc 
agggtggcag 
agggtgtcca 
aaaatacttc 
atatttccct 
accccacttc 
ggattctcaa 
tgaaacctta 
gaacttggca 
gaatcttgtc 
gagtttgaga 
aaaatacaaa 
aggttgaggg 
ttgagcaact 
aaaaaaaaaa 
gtatttgaaa 
ctcactagSfi 
CGCCAGCCAG 



ctgaaccagg 
tctgagccta 
gcacactgga 
ttcacaggga 
gtatcacccg 
ctgaccagaa 
agctgagcag 
tgacccaaaa 
ccccataact 
gaactgcctc 
gcgccattag 
ttgcttcctt 
ttattccaga 
agacccaatc 
gcagttactt 
aaagggcaac 
aacatctgtg 
ttgggctggg 
ccaacctggc 
tcagctggcc 
gagaattgct 
gcaatccagc 
aggatcgtct 
tgaaggtacc 
ATGAAGCAGA 



ggcactgcat 
cccttcctcc 
cctacgccca 
tcctagggaa 
gaagacttct 
tgctgctgga 
atgttatcac 
ccatgaggtg 
atttagggta 
actcctagga 
gtcattttct 
ggggtcagcc 
tttcctggac 
acgtgggagg 
tcacgggtca 
aatttcaatc 
gcctgttcag 
tgtggaggca 
caacatgatg 
gtcgtggtgt 
tgaacccagg 
ctgggcgacg 
caacctttgc 
ttccatactt 
TTCC7AACCT 



tgccctgtgc 
acaatttcct 
gcactggctt 
gtgttcggga 
tgggaccagg 
gaactgcccc 
tggccccaac 
gcagtcagct 
gtacacaagc 
actggtagat 
cactgcctgg 
caaagctgtc 
actgtcaccc 
aagtgtaact 
gaacacgcag 
ttgcttctag 
caaaggatgt 
agtgaatcac 
aaaccccatc 
gcctgtagtc 
aggtggtggt 
gagtgagact 
cctcctactg 
atgctgttaa 
GCTTCCTAAT 



GTCCTCACCT G TGTGTAGAG r.AGGAGGAr.A 



CTGATCCAGT CACAGCCATA r AGCTGTCCA r.AGTGAAGAA CGTGTCrTAr 
AACAGCCTGA ATCAAATGGT TAGCTTAATA GATAAAAATC CCAGACTATT 
TCAGCCTTTA ATGCCTTTTA TTCATAAAM r.TGTGAAAGC TAGAGTGAAG 
CATTGGAAAC ATTTAACTCA GACTCTGGAT TGAGAGTCGG GAACrCTTAG 
TTCTATCTGA ATCCAAGAGA GCCACACCTT AGTATAGTGr rGAAAGTAAT 
GAGTTTAATA AAT AGAAATA GTGGT (SEQ. ID. NO. : 1) 



FIG.1F 
CAGGGAGTCCCACCAGCCTAGTCGCCAGACCTTCTGTGGGATCATCGGAC 50
CCACCTGGAACCCCACCTGACCCAAGCCCACCTGCTGCAGCCCACTGCCT 100 
GGCCATGACCATCACTTACACAAGCCAAGTGGCTAATGCCCGCTTAGGCT 150 
CCTTCTCCCGCCTGCTGCTGTGCTGGCGGGGCAGCATCTACAAGCTGCTA 200 
TATGGCGAGTTCTTAATCTTCCTGCTCTGCTACTACATCATCCGCTTTAT 250 
TTATAGGCTGGCCCTCACGGAAGAACAACAGCTGATGTTTGAGAAACTGA 300 
CTCTGTATTGCGACAGCTACATCCAGCTCATCCCCATTTCCTTCGTGCTG 350 
GGCTTCTACGTGACGCTGGTCGTGACCCGCTGGTGGAACCAGTACGAGAA 400 
CCTGCCGTGGCCCGACCGCCTCATGAGCCTGGTGTCGGGCTTCGTCGAAG 450 
GCAAGGACGAGCAAGGCCGGCTGCTGCGGCGCACGCTCATCCGCTACGCC 500 
AACCTGGGCAACGTGCTCATCCTGCGCAGCGTCAGCACCGCAGTCTACAA 550 
GCGCTTCCCCAGCGCCCAGCACCTGGTGCAAGCAGGCTTTATGACTCCGG 600 
CAGAACACAAGCAGTTGGAGAAACTGAGCCTACCACACAACATGTTCTGG 650 
GTGCCCTGGGTGTGGTTTGCCAACCTGTCAATGAAGGCGTGGCTTGGAGG 700 
TCGAATCCGGGACCCTATCCTGCTCCAGAGCCTGCTGAACGAGATGAACA 750 
CCTTGCGTACTCAGTGTGGACACCTGTATGCCTACGACTGGATTAGTATC 800 
CCACTGGTGTATACACAGGTGGTGACTGTGGCGGTGTACAGCTTCTTCCT 850 
GACTTGTCTAGTTGGGCGGCAGTTTCTGAACCCAGCCAAGGCCTACCCTG 900 
GCCATGAGCTGGACCTCGTTGTGCCCGTCTTCACGTTCCTGCAGTTCTTC 950 
TTCTATGTTGGCTGGCTGAAGGTGGCAGAGCAGCTCATCAACCCCTTTGG 1000 
AGAGGATGATGATGATTTTGAGACCAACTGGATTGTCGACAGGAATTTGC 1050
AGGTGTCCCTGTTGGCTGTGGATGAGATGCACCAGGACCTGCCTCGGATG 1100
GAGCCGGACATGTACTGGAATAAGCCCGAGCCACAGCCCCCCTACACAGC 1150
TGCTTCCGCCCAGTTCCGTCGAGCCTCCTTTATGGGCTCCACCTTCAACA 1200
TCAGCCTGAACAAAGAGGAGATGGAGTTCCAGCCCAATCAGGAGGACGAG 1250
GAGGATGCTCACGCTGGCATCATTGGCCGCTTCCTAGGCCTGCAGTCCCA 1300 
TGATCACCATCCTCCCAGGGCAAACTCAAGGACCAAACTACTGTGGCCCA 1350 
AGAGGGAATCCCTTCTCCACGAGGGCCTGCCCAAAAACCACAAGGCAGCC 1400 
AAACAGAACGTTAGGGGCCAGGAAGACAACAAGGCCTGGAAGCTTAAGGC 1450 
TGTGGACGCCTTCAAGTCTGGCCCACTGTATCAGAGGCCAGGCTACTACA 1500 
GTGCCCCACAGACGCCCCTCAGCCCCACTCCCATGTTCTTCCCCCTAGAA 1550 
CCATCAGCGCCGTCAAAGCTTCACAGTGTCACAGGCATAGACACCAAAGA 1600 
CAAAAGCTTAAAGACTGTGAGTTCTGGGGCCAAGAAAAGTTTTGAATTGC 1650
TCTCAGAGAGCGATGGGGCCTTGATGGAGCACCCAGAAGTATCTCAAGTG 1700 
AGGAGGAAAACTGTGGAGTTTAACCTGACGGATATGCCAGAGATCCCCGA 1750 
AAATCACCTCAAAGAACCTTTGGAACAATCACCAACCAACATACACACTA 1800 
CACTCAAAGATCACATGGATCCTTATTGGGCCTTGGAAAACAGGGATGAA 1850 
GCACATTCCTAACCTGCTTCCTAATGGGGATGCTTCGCCAGCCAGGTCCT 1900 
CACCTGTGTGTACACCAGCAGGACACTGATCCAGTCACAGCCATACAGCT 1950 
GTCCACACTGAAGAACGTGTCCTACAACAGCCTGAATCAAATGGTTAGCT 2000 
TAATAGATAAAAATCCCAGACTACTTCAGCCTTTAATGCCTTTTATTCAT 2050
AAAAACTGTGAAAGCTAGACTGAACCATTGGAAACATTTAACTCAGACTC 2100 
TGGATTCAGAGTCGGGAACCCTTAGTTCTATCTGAATCCAAGACAGCCAC 2150 
ACCTTAGTATACTGCCCAAACTAATGAGTTTAATAAATACAAATACTCGT 2200 
TAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA(SEQ. ID. NO. : 2) 
MTITYTSQVANARLGSFSRLLLCWRGSIYKLLYGEFLIFLLCYYIIRFIY 50
RLALTEEQQLMFEKLTLYCDSYIQLIPISFVLGFYVTLVVTRWWNQYENL 100
PWPDRLMSLVSGFVEGKDEQGRLLRRTLIRYANLGNVLILRSVSTAVYKR 150
FPSAQHLVQAGFMTPAEHKQLEKLSLPHNMFWVPWVWFANLSMKAWLGGR 200
IRDPILLQSLLNEMNTLRTQCGHLYAYDWISIPLVYTQVVTVAVYSFFLT 250
CLVGRQFLNPAKAYPGHELDLVVPVFTFLQFFFYVGWLKVAEQLINPFGE 300
DDDDFETNWIVDRNLQVSLLAVDEMHQDLPRMEPDMYWNKPEPQPPYTAA 350
SAQFRRASFMGSTFNISLNKEEMEFQPNQEDEEDAHAGIIGRFLGLQSHD 400
HHPPRANSRTKLLWPKRESLLHEGLPKNHKAAKQNVRGQEDNKAWKLKAV 450 
DAFKSGPLYQRPGYYSAPQTPLSPTPMFFPLEPSAPSKLHSVTGIDTKDK 500 
SLKTVSSGAKKSFELLSESDGALMEHPEVSQVRRKTVEFNLTDMPEIPEN 550 
HLKEPLEQSPTNIHTTLKDHMDPYWALENRDEAHS (SEQ. ID. NO. :3) 



FIG. 3 
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CAGGGAGTCCCACCAGCCTAGTCGCCAGACCTTCTGTGGGATCATCGGAC 50 
CCACCTGGAACCCCACCTGACCCAAGCCCACCTGCTGCAGCCCACTGCCT 100 
GGCCATGACCATCACTTACACAAGCCAAGTGGCTAATGCCCGCTTAGGCT 150 
CCTTCTCCCGCCTGCTGCTGTGCTGGCGGGGCAGCATCTACAAGCTGCTA 200 
TATGGCGAGTTCTTAATCTTCCTGCTCTGCTACTACATCATCCGCTTTAT 250 
TTATAGGCTGGCCCTCACGGAAGAACAACAGCTGATGTTTGAGAAACTGA 300 
CTCTGTATTGCGACAGCTACATCCAGCTCATCCCCATTTCCTTCGTGCTG 350 
GGCTTCTACGTGACGCTGGTCGTGACCCGCTGGTGGAACCAGTACGAGAA 400 
CCTGCCGTGGCCCGACCGCCTCATGAGCCTGGTGTCGGGCTTCGTCGAAG 450 
GCAAGGACGAGCAAGGCCGGCTGCTGCGGCGCACGCTCATCCGCTACGCC 500 
AACCTGGGCAACGTGCTCATCCTGCGCAGCGTCAGCACCGCAGTCTACAA 550 
GCGCTTCCCCAGCGCCCAGCACCTGGTGCAAGCAGGCTTTATGACTCCGG 600 
CAGAACACAAGCAGTTGGAGAAACTGAGCCTACCACACAACATGTTCTGG 650
GTGCCCTGGGTGTGGTTTGCCAACCTGTCAATGAAGGCGTGGCTTGGAGG 700 
TCGAATCCGGGACCCTATCCTGCTCCAGAGCCTGCTGAACGAGATGAACA 750 
CCTTGCGTACTCAGTGTGGACACCTGTATGCCTACGACTGGATTAGTATC 800 
CCACTGGTGTATACACAGGTGGTGACTGTGGCGGTGTACAGCTTCTTCCT 850 
GACTTGTCTAGTTGGGCGGCAGTTTCTGAACCCAGCCAAGGCCTACCCTG 900 
GCCATGAGCTGGACCTCGTTGTGCCCGTCTTCACGTTCCTGCAGTTCTTC 950 
TTCTATGTTGGCTGGCTGAAGGTGGGCCTCTCCAGGGCCCTGCTGGGCTG 1000 
GAGGCATGGCCAGAGGGGTCATGGCCAGCAGCTGCTTGAGACGAGGATGC 1050 
AGTGTCAGGAAAGGAAGGTCTCACGGGTAGAAAGCAGCCAGGCGTGGTGG 1100 
CGCACACCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCT 1150
TGAACCCGGGAGGCGGAGGTTGTGGTGGCAGAGCAGCTCATCAACCCCTT 1200
TGGAGAGGATGATGATGATTTTGAGACCAACTGGATTGTCGACAGGAATT 1250
TGCAGGTGTCCCTGTTGGCTGTGGATGAGATGCACCAGGACCTGCCTCGG 1300
ATGGAGCCGGACATGTACTGGAATAAGCCCGAGCCACAGCCCCCCTACAC 1350
AGCTGCTTCCGCCCAGTTCCGTCGAGCCTCCTTTATGGGCTCCACCTTCA 1400
ACATCAGCCTGAACAAAGAGGAGATGGAGTTCCAGCCCAATCAGGAGGAC 1450
GAGGAGGATGCTCACGCTGGCATCATTGGCCGCTTCCTAGGCCTGCAGTC 1500
CCATGATCACCATCCTCCCAGGGCAAACTCAAGGACCAAACTACTGTGGC 1550
CCAAGAGGGAATCCCTTCTCCACGAGGGCCTGCCCAAAAACCACAAGGCA 1600
GCCAAACAGAACGTTAGGGGCCAGGAAGACAACAAGGCCTGGAAGCTTAA 1650
GGCTGTGGACGCCTTCAAGTCTGGCCCACTGTATCAGAGGCCAGGCTACT 1700 
ACAGTGCCCCACAGACGCCCCTCAGCCCCACTCCCATGTTCTTCCCCCTA 1750 
GAACCATCAGCGCCGTCAAAGCTTCACAGTGTCACAGGCATAGACACCAA 1800 
AGACAAAAGCTTAAAGACTGTGAGTTCTGGGGCCAAGAAAAGTTTTGAAT 1850
TGCTCTCAGAGAGCGATGGGGCCTTGATGGAGCACCCAGAAGTATCTCAA 1900
GTGAGGAGGAAAACTGTGGAGTTTAACCTGACGGATATGCCAGAGATCCC 1950
CGAAAATCACCTCAAAGAACCTTTGGAACAATCACCAACCAACATACACA 2000 
CTACACTCAAAGATCACATGGATCCTTATTGGGCCTTGGAAAACAGGGAT 2050 
GAAGCACATTCCTAACCTGCTTCCTAATGGGGATGCTTCGCCAGCCAGGT 2100 
CCTCACCTGTGTGTACACCAGCAGGACACTGATCCAGTCACAGCCATACA 2150 
GCTGTCCACACTGAAGAACGTGTCCTACAACAGCCTGAATCAAATGGTTA 2200 
GCTTAATAGATAAAAATCCCAGACTACTTCAGCCTTTAATGCCTTTTATT 2250
CATAAAAACTGTGAAAGCTAGACTGAACCATTGGAAACATTTAACTCAGA 2300
CTCTGGATTCAGAGTCGGGAACCCTTAGTTCTATCTGAATCCAAGACAGC 2350
CACACCTTAGTATACTGCCCAAACTAATGAGTTTAATAAATACAAATACT 2400 
CGTTAAAAAAAAAAAAAAAAAAAAAAAAAA(SEQ. ID. NO. :4) 
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MTITYTSQVANARLGSFSRLLLCWRGSIYKLLYGEFLIFLLCYYIIRFIY 50 
RLALTEEQQLMFEKLTLYCDSYIQLIPISFVLGFYVTLVVTRWWNQYENL 100 
PWPDRLMSLVSGFVEGKDEQGRLLRRTLIRYANLGNVLILRSVSTAVYKR 150 
FPSAQHLVQAGFMTPAEHKQLEKLSLPHNMFWVPWVWFANLSMKAWLGGR 200 
IRDPILLQSLLNEMNTLRTQCGHLYAYDWISIPLVYTQVVTVAVYSFFLT 250 
CLVGRQFLNPAKAYPGHELDLVVPVFTFLQFFFYVGWLKVGLSRALLGWR 300 
HGQRGHGQQLLETRMQCQERKVSRVESSQAWWRTPVIPATREAEAGESLE 350
PGRRRLWWQSSSSTPLERMMMILRPTGLSTGICRCPCWLWMRCTRTCLGW 400 
SRTCTGISPSHSPPTQLLPPSSVEPPLWAPPSTSA (SEQ.ID.NO. :5) 



FIG. 5 
GenBank/SwissProt accession numbers   Protein sequence                    SEQ. ID. NO.

CG1CE_protein                         IPISFVLGFY VTLVVTRWWN QYENLPWPDR    2 (part)
af016687 (PID:g2315833)               IPLTFMLGFF VTIIVGRWND TFI NTGWVnM   CO
Z73105 (PID:e242363)                  IPLTFMLGFF VTIIVRRWND IFANLGWVFN    29
Z73422 (PID:e244423)                  IPLEFVLGFE VTIVVDRWTK LWRTVGFIDD    30
Z73422 (PID:e244542)                  IPLEFVLGFE VTTVVNRWTK LYQTIGFIDN    31
p34577                                VPLDWMLGFE IAGVLRREWY LYDIIGFIDN    32
p34672                                IPLNFMLGFE VTAVVNRWTY LYQIIGFIDN    33
p34319                                LPLNFVLGFE CNIIIRRWLK LYTSLGNIDN    34
Z68335 (PID:e217363)                  IPINFMLGFE VTTVINRWMT QFANLGMIDN    35
Z68753 (PID:e218704)                  IPLTFLLGFE VSFVVARWGS ILNGIGWIDD    36
af025458 (PID:e2429439)               IPVTFMLGFY VSIVYNRWTK VFDNVGWIDT    37
U28412 (PID:g849242)                  LPLTFMLGFE VTTVFERWRS ALNVMPFIES    38
U70848 (PID:g1572760)                 IPLTFLLGFI VSNVVSRWWR QFETLRWPED    39
Z81074 (PID:e351507)                  IPLTFLLGFY VSNVVARWWR QFETLYWPED    40
q09379                                IPLTFLLGFI VAMIVRRWWD CCQLISWPDH    41
Z72509 (PID:e239377)                  IPLSFLLGFE VSLIVARWWE QFNCISWPDK    42
Z83221 (PID:e349023)                  VPMQPMLGYE IGMVGERWGE SFENVSYIEK    43
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1 GTGCCAAGCCATGACTATCACCTACACAAACAAAGTAGCCAATGCCCGCCTCGGTTCGTT 60 
1 MTITYTNKVANARLGSF17 

61 CTCGTCCCTCCTCCTGTGCTGGCGAGGCAGCATCTACAAGCTGCTGTATGGAGAATTCCT 120
18SSLLLCWRGSIYKLLYGEFL37 

121 TGTCTTCATATTCCTCTACTATTCCATCCGTGGACTCTACAGAATGGTTCTCTCGAGTGA 180 
38VFIFLYYSIRGLYRMVLSSD57 

181 TCAGCAGCTGTTGTTTGAGAAGCTGGCTCTGTACTGCGACAGCTACATTCAGCTCATCCC 240 
58 QQLLFEKLALYCDSYIQLIP 77

241 TATATCCTTCGTTCTGGGTTTCTATGTTACATTGGTGGTGAGCCGCTGGTGGAGCCAGTA 300 
78ISFVLGFYVTLVVSRWWSQY97 

301 CGAGAACTTGCCGTGGCCCGACCGCCTCATGATCCAGGTGTCTAGCTTCGTGGAGGGCAA 360 
98ENLPWPDRLMIQVSSFVEGK 117 

361 GGATGAGGAAGGCCGTTTGCTGCGGCGCACGCTCATCCGCTACGCCATCCTGGGCCAAGT 420 
118 DEEGRLLRRTLIRYAILGQV 137 

421 GCTCATCCTGCGCAGCATCAGCACCTCGGTCTACAAGCGCTTTCCCACTCTTCACCACCT 480 
138 LILRSISTSVYKRFPTLHHL 157 

481 GGTGCTAGCAGGTTTTATGACCCATGGGGAACATAAGCAGTTGCAGAAGTTGGGCCTACC 540 
158 VLAGFMTHGEHKQLQKLGLP 177 

541 ACACAACACATTCTGGGTGCCCTGGGTGTGGTTTGCCAACTTGTCAATGAAGGCCTATCT 600 
178 HNTFWVPWVWFANLSMKAYL 197 

601 TGGAGGTCGAATCCGGGACACCGTCCTGCTCCAGAGCCTGATGAATGAGGTGTGTACTTT 660 
198 GGRIRDTVLLQSLMNEVCTL 217

661 GCGTACTCAGTGTGGACAGCTGTATGCCTACGACTGGATAAGTATCCCATTGGTGTACAC 720 
218 RTQCGQLYAYDWISIPLVYT 237 

721 ACAGGTGGTGACAGTGGCAGTATACAGCTTTTTCCTTGCATGCTTGATCGGGAGGCAGTT 780 
238 QVVTVAVYSFFLACLIGRQF 257
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781 TCTGAACCCAAACAAGGACTACCCAGGCCATGAGATGGATCTGGTTGTGCCTGTCTTCAC 840 
258 LNPNKDYPGHEMDLVVPVFT 277 

841 AATCCTGCAATTCTTATTCTACATGGGCTGGCTGAAGGTGGCAGAACAGCTCATCAACCC 900 
278 ILQFLFYMGWLKVAEQLINP 297 

901 CTTCGGGGAGGACGATGATGATTTTGAGACTAACTGGATCATTGACAGAAACCTGCAGGT 960 
298 FGEDDDDFETNWIIDRNLQV 317

961 GTCCCTGTTGTCCGTGGATGGGATGCACCAGAACTTGCCTCCCATGGAACGTGACATGTA 1020 
318 SLLSVDGMHQNLPPMERDMY 337 

1021 CTGGAACGAGGCAGCGCCTCAGCCGCCCTACACAGCTGCTTCTGCCAGGTCTCGCCGGCA 1080
338 WNEAAPQPPYTAASARSRRH 357 

1081 TTCCTTCATGGGCTCCACCTTCAACATCAGCCTAAAGAAAGAAGACTTAGAGCTTTGGTC 1140 
358 SFMGSTFNISLKKEDLELWS 377 

1141 AAAAGAGGAGGCTGACACGGATAAGAAAGAGAGTGGCTATAGCAGCACCATAGGCTGCTT 1200 
378 KEEADTDKKESGYSSTIGCF 397 

1201 CTTAGGACTGCAACCCAAAAACTACCATCTTCCCTTGAAAGACTTAAAGACCAAACTATT 1260 
398 LGLQPKNYHLPLKDLKTKLL 417 

1261 GTGTTCTAAGAACCCCCTCCTCGAAGGCCAGTGTAAGGATGCCAACCAGAAAAACCAGAA 1320 
418 CSKNPLLEGQCKDANQKNQK 437 

1321 AGATGTCTGGAAATTTAAGGGTCTGGACTTCTTGAAATGTGTTCCAAGGTTTAAGAGGAG 1380 
438 DVWKFKGLDFLKCVPRFKRR 457 

1381 AGGCTCCCATTGTGGCCCACAGGCACCCAGCAGCCACCCTACTGAGCAGTCAGCACCCTC 1440 
458 GSHCGPQAPSSHPTEQSAPS 477 

1441 CAGTTCAGACACAGGTGATGGGCCTTCCACAGATTACCAAGAAATCTGTCACATGAAAAA 1500 
478 SSDTGDGPSTDYQEICHMKK 497 

1501 GAAAACTGTGGAGTTTAACTTGAACATTCCAGAGAGCCCCACAGAACATCTTCAACAGCG 1560 
498 KTVEFNLNIPESPTEHLQQR 517 

1561 CCGTTTGGACCAGATGTCAACCAATATACAGGCTCTAATGAAGGAGCATGCAGAGTCCTA 1620
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518 RLDQMSTNIQALMKEHAESY 537 

1621 TCCCTACAGGGATGAAGCTGGCACCAAACCTGTTCTCTATGAGTGATGCCTCACAGCCTG 1680 
538 PYRDEAGTKPVLYE 551 

1681 GCCCTGACTTGCAAGGATGCCCAGCAGGGCACTGACCCAGTCAAAGGCACACAAGCAGCG 1740 

1741 ACACCCAGGAGTGTGTTCCCACGACAGTCTAGCATGTAACTCAGAACCAAGAGTACTTAA 1800 

1801 TAGTCCTGCCTGAAAACACCTGTATTTTACGATCTTTCCCAAACTAAGGAGTTTAATAAA 1860
1861 CGTGAATATTCTTTTAGGTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1916 

FIG.8C 
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1 50 
Human MTITYTSQVA NARLGSFSRL LLCWRGSIYK LLYGEFLIFL LCYYIIRFIY 
MouseBestrophin] MTITYTNKVA NARLGSFSSL LLCWRGSIYK LLYGEFLVFI FLYYSIRGLY 

51 100 
Human RLALTEEQQL MFEKLTLYCD SYIQLIPISF VLGFYVTLVV TRWWNQYENL 
MouseBestrophin] RMVLSSDQQL LFEKLALYCD SYIQLIPISF VLGFYVTLVV SRWWSQYENL 

101 150

Human PWPDRLMSLV SGFVEGKDEQ GRLLRRTLIR YANLGNVLIL RSVSTAVYKR 
MouseBestrophin] PWPDRLMIQV SSFVEGKDEE GRLLRRTLIR YAILGQVLIL RSISTSVYKR 

151 200 
Human FPSAQHLVQA GFMTPAEHKQ LEKLSLPHNM FWVPWVWFAN LSMKAWLGGR 
MouseBestrophin] FPTLHHLVLA GFMTHGEHKQ LQKLGLPHNT FWVPWVWFAN LSMKAYLGGR 

201 250 
Human IRDPILLQSL LNEMNTLRTQ CGHLYAYDWI SIPLVYTQVV TVAVYSFFLT 
MouseBestrophin] IRDTVLLQSL MNEVCTLRTQ CGQLYAYDWI SIPLVYTQVV TVAVYSFFLA 

251 300 
Human CLVGRQFLNP AKAYPGHELD LVVPVFTFLQ FFFYVGWLKV AEQLINPFGE 
MouseBestrophin] CLIGRQFLNP NKDYPGHEMD LVVPVFTILQ FLFYMGWLKV AEQLINPFGE 

301 350 
Human DDDDFETNWI VDRNLQVSLL AVDEMHQDLP RMEPDMYWNK PEPQPPYTAA
MouseBestrophin] DDDDFETNWI IDRNLQVSLL SVDGMHQNLP PMERDMYWNE AAPQPPYTAA 

351 400 
Human SAQFRRASFM GSTFNISLNK EEMEFQPNQE ....DEEDAH AGIIGRFLGL
MouseBestrophin] SARSRRHSFM GSTFNISLKK EDLELWSKEE ADTDKKESGY SSTIGCFLGL 

401 450 
Human QSHDHHPPRA NSRTKLLWPK RESLLHEGLP KNHKAAKQNV RGQEDNKAWK 
MouseBestrophin] QPKNYHLPLK DLKTKLLCSK NPLL..EGQC KD ANQ KNQKD. .VWK 


451 500 
Human LKAVDAFKSA PLYQRPGYYS APQTPLSPTP MFFPLEPSAP SKLHSVTGID 
MouseBestrophin] FRGLDFLKCV PRFKRRGSHC GPQAPSS HPTEQSAP SS..SDTG.. 

501 550 
Human TKDKSLKTVS SGAKKSFELL SESDGALMEH PEVSQVRRKT VEFNLTDMPE 
MouseBestrophin] DGPSTDY QEICHMKKKT VEFNL.NIPE 

551 596 

Human IPENHLKE.P LEQSPTNIHT TLKDHMDPYW ALENRDEAHS 

MouseBestrophin] SPTEHLQQRR LDQMSTNIQA LMKEHAESY. . . PYRDEAGT KPVLYE 
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Petrukhin, et al. 
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Application Number 



Filing Date 



Group Art Unit 



Examiner Name 



As a below named inventor, I hereby declare that: 



My residence, post office address, and citizenship are as stated below next to my name. 



I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if 
plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled: 



BEST'S MACULAR DYSTROPHY GENE 



the specification of which 

| | is attached hereto
OR 

|X| was filed on (MM/DD/YYYY) 



(Title of the Invention) 



08/23/2000 



Application Number 



09/622,964 



as United States Application Number or PCT International 

(if applicable). 



and was amended on (MM/DD/YYYY) 



I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as 
amended by any amendment specifically referred to above. 

I acknowledge the duty to disclose to the Patent and Trademark Office all information known to me to be material to patentability 
as defined in 37 CFR 1.56. 



I hereby claim foreign priority benefits under 35 U.S.C. 119(a)-(d) or 365(b) of any foreign application(s) for patent or inventor's
certificate, or 365(a) of any PCT international application which designated at least one country other than the United States of 
America, listed below and have also identified below, by checking the box, any foreign application for patent or inventor's 
certificate, or of any PCT international application having a filing date before that of the application on which priority is claimed. 



Prior Foreign Application 
Number(s) 


Country 


Foreign Filing Date 
(MM/DD/YYYY) 


Attorney Docket Number 


Priority Claimed? 
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Additional foreign application numbers are listed on a supplemental priority data sheet PTO/SB/02B attached hereto: 



I hereby claim the benefit under 35 U.S.C. 119(e) of any United States provisional application(s) listed below
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Filing Date 
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Attorney Docket Number 
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60/075,941 


02/25/1998
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is not disclosed in the prior United States or PCT international application in the manner provided by the first paragraph of 

35 U.S.C. 112, I acknowledge the duty to disclose information known to me to be material to patentability as defined in

37 CFR 1.56 which became available between the filing date of the prior application and the national or PCT international filing 

date of this application. 
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Application Number 



PCT/US99/03790 



Parent Filing Date 
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02/22/1999 



Parent Patent Number 
| | Additional U.S. or PCT international application numbers are listed on a supplemental priority data sheet PTO/SB/02B attached hereto
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following registered practitioner(s) to prosecute this application and to transact all business in the Patent and Trademark Office 
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38,413 
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I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as 
amended by any amendment specifically referred to above. 



I acknowledge the duty to disclose to the Patent and Trademark Office all information known to me to be material to patentability 
as defined in 37 CFR 1.56. 



I hereby claim foreign priority benefits under 35 U.S.C. 119(a)-(d) or 365(b) of any foreign application(s) for patent or inventor's
certificate, or 365(a) of any PCT international application which designated at least one country other than the United States of 
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certificate, or of any PCT international application having a filing date before that of the application on which priority is claimed. 
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DECLARATION AND 
POWER OF ATTORNEY 
FOR UTILITY OR DESIGN 
PATENT APPLICATION 
(37 CFR 1.63) 



| | Declaration Submitted with Initial Filing

OR

|X| Declaration Submitted after Initial Filing (surcharge (37 CFR 1.16(e)) required)



Attorney Docket Number 



First Named Inventor 



20177YP 



Petrukhin, et al. 



COMPLETE IF KNOWN 



Application Number 



Filing Date 



Group Art Unit 



Examiner Name 



As a below named inventor, I hereby declare that: 

My residence, post office address, and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if 
plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled: 



BEST'S MACULAR DYSTROPHY GENE 



the specification of which 

| | is attached hereto 
OR 

|X| was filed on (MM/DD/YYYY)



(Title of the Invention) 



08/23/2000 



Application Number 



09/622,964



as United States Application Number or PCT International 

(if applicable). 



and was amended on (MM/DD/YYYY) 



I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as 
amended by any amendment specifically referred to above. 

I acknowledge the duty to disclose to the Patent and Trademark Office all information known to me to be material to patentability 
as defined in 37 CFR 1.56. 



I hereby claim foreign priority benefits under 35 U.S.C. 119(a)-(d) or 365(b) of any foreign application(s) for patent or inventor's
certificate, or 365(a) of any PCT international application which designated at least one country other than the United States of 
America, listed below and have also identified below, by checking the box, any foreign application for patent or inventor's 
certificate, or of any PCT international application having a filing date before that of the application on which priority is claimed. 



Prior Foreign Application 
Number(s) 



Country 



Foreign Filing Date 
(MM/DD/YYYY) 



Attorney Docket Number 



Priority Claimed? 
YES NO 



□ □ 



□ □ 



□ □ 



□ □ 



| | Additional foreign application numbers are listed on a supplemental priority data sheet PTO/SB/02B attached hereto 



I hereby claim the benefit under 35 U.S.C. 119(e) of any United States provisional application(s) listed below



Application Number(s) 



Filing Date 
(MM/DD/YYYY) 



Attorney Docket Number 



60/112,926



12/18/1998 



20177PV2 



60/075,941 



02/25/1998



20177PV 
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DECLARATION AND POWER OF ATTORNEY for Utility or Design Patent Application 






I hereby claim the benefit under 35 U.S.C. 120 of any United States application(s), or 365(c) of any PCT international application
designating the United States of America, listed below and, insofar as the subject matter of each of the claims of this application
is not disclosed in the prior United States or PCT international application in the manner provided by the first paragraph of
35 U.S.C. 112, I acknowledge the duty to disclose information known to me to be material to patentability as defined in
37 CFR 1.56 which became available between the filing date of the prior application and the national or PCT international filing 
date of this application. 



U.S. Parent Application or PCT Parent 
Application Number 



Parent Filing Date 
(MM/DD/YYYY)



Parent Patent Number 
(if applicable) 



PCT/US99/03790 



02/22/1999 



| | Additional U.S. or PCT international application numbers are listed on a supplemental priority data sheet PTO/SB/02B attached hereto



As a named inventor, I hereby appoint, respectively and individually, as my attorneys or agents with full power of substitution and revocation, the 
following registered practitioner(s) to prosecute this application and to transact all business in the Patent and Trademark Office 

connected therewith: 

| | Customer Number 



OR 



|X| Registered practitioner(s) name/registration number listed below



Place Customer Number 
Bar Code Label here 



Name 



Registration 
Number 



Name 



Registration 
Number 



Joseph A. Coppola 



38,413 



Jack L. Tribble 



Direct all correspondence to: |X| Customer Number or Bar Code Label 000210



Name 



Joseph A. Coppola 



Address 



Merck & Co., Inc. - Patent Department 



Address 



P.O. Box 2000, RY60-30 



City 



Rahway 



State 



NJ 



ZIP 



07065-0907 



Country 



USA 



Telephone (732)594-6734



Fax 



(732)594-4720 



I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and 
belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and 
the like so made are punishable by fine or imprisonment, or both, under 18 U.S.C. 1001 and that such willful false statements
may jeopardize the validity of the application or any patent issued thereon. 



Name of Sole or First Inventor: 



| | A petition has been filed for this unsigned inventor 



Given Name (first and middle [if any]) 



Family Name or Surname 



Konstantin 



Petrukhin 



Inventor's 
Signature 



Date 



Residence: 
City 



Collegeville 



State 



PA 



Country 



US 



Citizenship 



RU 



Post Office 
Address 



Merck & Co., Inc., P.O. Box 2000 



City 



Rahway 



State NJ 



ZIP 



07065-0907 



|X| Additional inventors are being named on the 1 supplemental Additional Inventor(s) sheet(s) PTO/SB/02A attached hereto.
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Name of Additional Joint Inventor, if any: 


| | A petition has been filed for this unsigned inventor 


Given Name (first and middle [if any]) 


Family Name or Surname 


C. Thomas 


Caskey 


Inventor's 
Signature 




Date 




Residence: 
City


Lansdale 


State 


PA 


Country 


US 


Citizenship 


US 


Post Office 
Address 


Merck & Co., Inc., P.O. Box 2000 


City 


Rahway


State 


NJ 


ZIP 


07065-0907 


Name of Additional Joint Inventor, if any: 


| | A petition has been filed for this unsigned inventor 


Given Name (first and middle [if any]) 


Family Name or Surname 


Michael 


Metzker 


Inventor's 
Signature 




Date 




Residence: 
City


Fort Washington 


State 


PA 


Country 


US 


Citizenship 


US


Post Office 
Address 


Merck & Co., Inc., P.O. Box 2000 


City 


Rahway


State 


NJ 


ZIP 


07065-0907 


Name of Additional Joint Inventor, if any: 


| | A petition has been filed for this unsigned inventor 


Given Name (first and middle [if any]) 


Family Name or Surname 


Claes


Wadelius


Inventor's 
Signature 


— 


Date 




Residence: 
City


Upsala


State 




Country 


Sweden 


Citizenship 


SE


Post Office 
Address 


Merck & Co., Inc., P.O. Box 2000 


City 


Rahway


State 


NJ 


ZIP 


07065-0907 


Name of Additional Joint Inventor, if any: 


| | A petition has been filed for this unsigned inventor 


Given Name (first and middle [if any]) 


Family Name or Surname 






Inventor's 
Signature 




Date 




Residence: 
City 




State 




Country 




Citizenship 




Post Office 
Address 


Merck & Co., Inc., P.O. Box 2000 


City 


Rahway 


State 


NJ 


ZIP 


07065-0907 
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